Preface


This book is intended as an elementary introduction to linear algebra 
and matrix theory for first year students in Universities and Polytechnics. 
It is suitable for intending mathematicians, scientists, engineers and 
social scientists who require a thorough grounding in linear algebra and 
matrix theory as part of their course. It has been used in a class which 
included both mathematicians and scientists and is based on a first year 
course given at the University College of Wales, Aberystwyth over a 
number of years. 

Although there are many excellent books already available on this 
subject, the author has not found one which is totally suitable for use 
at this level. They tend in general to be too long and encyclopaedic and 
the approach too abstract to be used as a course text in a first
introduction to the subject. The approach used in this book emphasises the
computational and practical aspects of the subject, but at the same time 
it aims to give a thorough and rigorous introduction to the subject at a 
level which is suitable for use as a first introduction to abstract concepts 
in mathematics. The book is structured so as to first give the more 
concrete and practical aspects which lead naturally to the more abstract 
ideas developed later. 

The book’s six chapters cover linear equations and matrices, 
determinants, vector spaces, linear transformations on vector spaces, 
inner product spaces and diagonalization of matrices and linear
transformations. The first two chapters, after first motivating the discussion
by considering 2 x 2 and 3 x 3 matrices and determinants and the
corresponding linear equations, proceed to develop these concepts for
general m x n arrays. The main emphasis is on providing efficient and
effective techniques for solving linear equations, expanding
determinants, manipulating matrices, etc. The later chapters are more abstract
in character, but each new abstract concept is motivated by considering 
the relevant concept in 2- or 3-dimensions. The aim is to show that the 
definitions of these abstract concepts are the natural and useful 
extensions of what occurs in lower dimensions. Furthermore, it is 
hoped that the student will develop an appreciation that abstraction
not only adds elegance to the subject but that it also leads quickly to
effective techniques for dealing with certain practical problems. The 
exercises have also been selected to give the reader practical experience 
of the new concepts as they are introduced. They have only rarely been 
used to develop additional theory as is sometimes customary. 

A rudimentary knowledge of algebra is assumed, e.g. complex 
numbers, fields, set theory, equivalence relations and mappings, which 
should be available to the reader through a concurrent or preceding 
course (or for some from their A-level courses). A general reference is
P. J. Higgins, A First Course in Abstract Algebra, VNR New Mathematics
Library 7.

The book has benefited from the detailed reading of my colleague 
Mr. Meurig John; I am grateful to him for his ready assistance as I am 
to Mrs. Noreen Davies who patiently typed the manuscript. 

The University College of Wales                              Alun O. Morris

Aberystwyth 

March 1977 


Preface to the Second Edition 

The second edition of this book is largely a reproduction of the first 
edition. I have taken advantage of the opportunity to clarify a few 
obscurities and to correct some minor errors in the first edition. I am 
indebted to my colleagues who have drawn my attention to these. 

The major changes have resulted in more examples of vector spaces
and linear transformations with an analytic flavour, a complete
rewriting of the effect of a change of basis on the matrix of a linear
transformation and a fuller treatment of the application of quadratic
forms to the classification of conic sections and quadric surfaces. At the
suggestion of many readers, solutions for all the exercises have been 
included. 


The University College of Wales 

Aberystwyth 

June 1982 


Alun O. Morris


CHAPTER 1


Linear Equations and Matrices 


1.1 Introduction 


The reader is probably already familiar with some methods for solving 
systems of linear equations where a small number of equations and 
variables are involved. Applications of linear algebra to other branches 
of science, engineering, economics or elsewhere generally occur via the 
need to solve such systems of linear equations. It may be claimed that 
one of the main aims of linear algebra is to 

(i) find the most economic way of manipulating and solving such 
systems, 

(ii) obtain useful theoretical results concerning such systems. 

At an elementary level, the method normally used for the solution 
of a system of linear equations is one involving either the use of 
determinants or the use of an elimination process. In this chapter, our 
intention is to develop this second method to the general case. This 
turns out to be the most efficient method of dealing with such a system. 

As motivation for the work to be developed in the remainder of this 
chapter, we first illustrate the work by considering a few examples. 


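EXAMPLE Find all the solutions of the following systems of linear
equations:

(i)
\[ \begin{aligned} x_1 - 2x_2 + x_3 &= 1 \\ 2x_1 - x_2 + x_3 &= 2 \\ 4x_1 + x_2 - x_3 &= 1 \end{aligned} \]

(ii)
\[ \begin{aligned} x_1 + x_2 &= 2 \\ 2x_1 + 2x_2 &= 3 \end{aligned} \]

(iii)
\[ \begin{aligned} x_1 - 2x_2 + x_3 &= 1 \\ 2x_1 - x_2 + x_3 &= 2 \end{aligned} \]
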

(i) Subtract twice and four times the first equation from the second
and third equations respectively, giving the system of equations

\[ \begin{aligned} x_1 - 2x_2 + x_3 &= 1 \\ 3x_2 - x_3 &= 0 \\ 9x_2 - 5x_3 &= -3 \end{aligned} \]

Now, divide the second equation by 3 and add twice the resulting
equation to the first equation and subtract nine times the resulting
equation from the third equation, giving

\[ \begin{aligned} x_1 + \tfrac{1}{3} x_3 &= 1 \\ x_2 - \tfrac{1}{3} x_3 &= 0 \\ -2x_3 &= -3 \end{aligned} \]

which leads to the solution

\[ x_1 = \tfrac{1}{2}, \quad x_2 = \tfrac{1}{2}, \quad x_3 = \tfrac{3}{2}. \]

This is the unique solution of the original system of equations. What we
have done is to replace in a systematic way the original system of 
equations by a system of equations “equivalent” to it whose solution is 
more easily obtained. 

(ii) These two equations are inconsistent, since subtracting twice 
one equation from the other leads to the fallacious statement 1 = 0, 
thus no solution exists in this case. 

(iii) The system of equations may be reduced to the system of
equations

\[ \begin{aligned} x_1 + \tfrac{1}{3} x_3 &= 1 \\ x_2 - \tfrac{1}{3} x_3 &= 0 \end{aligned} \]

or in other words

\[ x_1 = 1 - \tfrac{1}{3} x_3, \quad x_2 = \tfrac{1}{3} x_3. \]

If we now put $x_3 = t$, we find that whatever value we assign to $t$, we
obtain a solution of the original system of equations. For example, if
$t = 0$, $x_1 = 1$, $x_2 = x_3 = 0$ is a solution and if $t = 3$, $x_1 = 0$, $x_2 = 1$,
$x_3 = 3$ is another solution. In this case, not only does a solution exist,
but there is more than one solution to the system.

From the above examples, we note that a system may have 
(i) no solution, (ii) a unique solution, or (iii) more than one solution. 
Furthermore, the process of solution is to use a systematic method of 
elimination which leads to a “simpler equivalent” system of equations 
whose solution is more easily computed. 
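
The solutions found in the example can be checked by direct substitution. The following is a minimal sketch in Python (a language the book itself does not use); the helper name `satisfies` and the representation of each equation as a pair (coefficients, right-hand side) are assumptions made only for this illustration.

```python
from fractions import Fraction as F

def satisfies(system, x):
    """True if the tuple x satisfies every equation (coefficients, constant)."""
    return all(sum(a * v for a, v in zip(coeffs, x)) == rhs
               for coeffs, rhs in system)

# Systems (i) and (iii) of the example, as (coefficients, right-hand side).
system_i = [([1, -2, 1], 1), ([2, -1, 1], 2), ([4, 1, -1], 1)]
system_iii = system_i[:2]

# The unique solution of system (i).
unique = (F(1, 2), F(1, 2), F(3, 2))
print(satisfies(system_i, unique))

# One solution of system (iii) for each value of the parameter t.
for t in (0, 3, -6):
    x = (1 - F(t, 3), F(t, 3), F(t))
    print(t, x, satisfies(system_iii, x))
```

Exact fractions are used so that the comparison with the right-hand sides is not disturbed by rounding.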

Initially, we shall be most interested in the technique of solution of
linear equations; later we shall take up further the questions

(i) When do solutions exist?

(ii) When is a solution the unique solution?


1.2 Elementary Row Operations on Matrices

Let K be a field. 

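DEFINITION 1.1 A rectangular array of elements of $K$

\[ A = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \dots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \dots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{m1} & \alpha_{m2} & \dots & \alpha_{mn} \end{pmatrix} \]

is called a matrix. For $i = 1, 2, \dots, m$, let

\[ r_i = (\alpha_{i1}, \alpha_{i2}, \dots, \alpha_{in}) \]

and for $j = 1, 2, \dots, n$, let

\[ c_j = \begin{pmatrix} \alpha_{1j} \\ \alpha_{2j} \\ \vdots \\ \alpha_{mj} \end{pmatrix} \]

then the $r_i$ $(i = 1, \dots, m)$ are called the rows of the matrix and the
$c_j$ $(j = 1, \dots, n)$ are called the columns of the matrix. A matrix with
$m$ rows and $n$ columns is called an $m \times n$ matrix. The element of $K$ at
the intersection of the $i$th row and $j$th column is called the $(i, j)$th
element of the matrix. The matrix is written in abbreviated form as

\[ A = (\alpha_{ij})_{m \times n} \quad \text{or} \quad A = (\alpha_{ij}). \]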

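EXAMPLE

\[ \begin{pmatrix} 1 & 0 & -3 & 1 \\ 2 & 1 & 3 & 1 \\ 1 & 0 & 1 & 1 \end{pmatrix} \]

is a $3 \times 4$ matrix, where $(2, 1, 3, 1)$ is a row of the matrix and
$\begin{pmatrix} -3 \\ 3 \\ 1 \end{pmatrix}$ is a column of the matrix.
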

DEFINITION 1.2 The following are called elementary row operations
on a matrix $A = (\alpha_{ij})$,

(i) if $1 \le i \le m$, $\alpha \in K$, $\alpha \ne 0$, multiply the $i$th row of $A$ by $\alpha$,

(ii) if $1 \le i, j \le m$, $i \ne j$, $\alpha \in K$, add $\alpha$ times the $j$th row to the $i$th
row,

(iii) interchange the $i$th and $j$th rows.

If $r_i$ $(i = 1, \dots, m)$ are the $m$ rows of $A$, these three elementary
row operations are written as follows

(i) $r_i \to \alpha r_i$, (ii) $r_i \to r_i + \alpha r_j$, (iii) $r_i \leftrightarrow r_j$.

DEFINITION 1.3 If A and B are two m x n matrices , then A is row 

equivalent to B if B can be obtained from A by a finite sequence of 
elementary row operations. 
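
The three elementary row operations of Definition 1.2 can be sketched as small Python functions. This is an illustration only: the function names `scale`, `add_multiple` and `swap`, and the choice to store a matrix as a list of rows, are assumptions and not part of the text. Rows are numbered from 1, as in the definition, and each function returns a new matrix.

```python
def scale(A, i, a):
    """r_i -> a r_i, with a non-zero (rows are numbered from 1)."""
    if a == 0:
        raise ValueError("the scalar must be non-zero")
    B = [row[:] for row in A]
    B[i - 1] = [a * x for x in B[i - 1]]
    return B

def add_multiple(A, i, j, a):
    """r_i -> r_i + a r_j, with i different from j."""
    if i == j:
        raise ValueError("i and j must be different")
    B = [row[:] for row in A]
    B[i - 1] = [x + a * y for x, y in zip(B[i - 1], B[j - 1])]
    return B

def swap(A, i, j):
    """r_i <-> r_j."""
    B = [row[:] for row in A]
    B[i - 1], B[j - 1] = B[j - 1], B[i - 1]
    return B

# A finite sequence of operations: r_2 -> r_2 + 2 r_3, then r_2 <-> r_3.
A = [[1, 0, -1, 1], [2, 1, 0, 1], [-1, 1, 0, 2]]
B = swap(add_multiple(A, 2, 3, 2), 2, 3)
print(B)
```

Since `B` is obtained from `A` by two elementary row operations, `A` is row equivalent to `B` in the sense of Definition 1.3.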


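EXAMPLE

\[ \begin{pmatrix} 1 & 0 & -1 & 1 \\ 2 & 1 & 0 & 1 \\ -1 & 1 & 0 & 2 \end{pmatrix} \]

is row equivalent to

\[ \begin{pmatrix} 1 & 0 & -1 & 1 \\ -1 & 1 & 0 & 2 \\ 0 & 3 & 0 & 5 \end{pmatrix} \]

since

\[ \begin{pmatrix} 1 & 0 & -1 & 1 \\ 2 & 1 & 0 & 1 \\ -1 & 1 & 0 & 2 \end{pmatrix} \xrightarrow{\; r_2 \to r_2 + 2r_3 \;} \begin{pmatrix} 1 & 0 & -1 & 1 \\ 0 & 3 & 0 & 5 \\ -1 & 1 & 0 & 2 \end{pmatrix} \xrightarrow{\; r_2 \leftrightarrow r_3 \;} \begin{pmatrix} 1 & 0 & -1 & 1 \\ -1 & 1 & 0 & 2 \\ 0 & 3 & 0 & 5 \end{pmatrix} \]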






DEFINITION 1.4 An m x n matrix A is called an echelon matrix if 

(i) the first non-zero element in each non-zero row is 1 , 

(ii) the leading 1 in any non-zero row occurs to the right of the 
leading 1 in any preceding row , 

(iii) the non-zero rows appear before the zero rows. 

An echelon matrix is called a reduced echelon matrix if 

(iv) the leading 1 in any non-zero row is the only non-zero element
in the column in which that 1 occurs.

EXAMPLE

\[ \begin{pmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 1 & 1 \end{pmatrix} \]

is an echelon matrix, but not a reduced echelon matrix,

\[ \begin{pmatrix} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{pmatrix} \]

is a reduced echelon matrix, but


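\[ \begin{pmatrix} 0 & 1 & 0 & 2 \\ 1 & 0 & 2 & 0 \\ & \dots & & \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 \\ & \dots & & \end{pmatrix} \]
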


are not even echelon matrices. 
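
Conditions (i) to (iv) of Definition 1.4 can be tested mechanically. The sketch below is in Python and is only an illustration; the names `leading_index`, `is_echelon` and `is_reduced_echelon` are hypothetical, and the two matrices named `not_echelon_1` and `not_echelon_2` are small examples chosen here, not taken from the text.

```python
def leading_index(row):
    """Position of the first non-zero element of a row, or None for a zero row."""
    for j, x in enumerate(row):
        if x != 0:
            return j
    return None

def is_echelon(A):
    previous = -1
    seen_zero_row = False
    for row in A:
        j = leading_index(row)
        if j is None:
            seen_zero_row = True
            continue
        # (iii) no non-zero row after a zero row, (i) leading element is 1,
        # (ii) leading 1 lies to the right of those in preceding rows.
        if seen_zero_row or row[j] != 1 or j <= previous:
            return False
        previous = j
    return True

def is_reduced_echelon(A):
    if not is_echelon(A):
        return False
    for i, row in enumerate(A):
        j = leading_index(row)
        # (iv) a leading 1 is the only non-zero element in its column.
        if j is not None and any(A[k][j] != 0 for k in range(len(A)) if k != i):
            return False
    return True

echelon = [[1, 0, -1, 2], [0, 1, 1, 3], [0, 0, 1, 1]]
reduced = [[1, 0, 0, 3], [0, 1, 0, 2], [0, 0, 1, 1]]
not_echelon_1 = [[0, 1, 0, 2], [1, 0, 2, 0]]                # leading 1s in the wrong order
not_echelon_2 = [[1, 0, 2, 0], [0, 0, 0, 0], [0, 1, 0, 0]]  # zero row before a non-zero row
for M in (echelon, reduced, not_echelon_1, not_echelon_2):
    print(is_echelon(M), is_reduced_echelon(M))
```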
We can now prove 


THEOREM 1.5 Every m x n matrix is row equivalent to an m x n 
reduced echelon matrix. 


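PROOF (Gauss Elimination Method) Let the $j$th column of $A$, where
$1 \le j \le n$, be the first column of $A$ which contains a non-zero element.
Suppose that $\alpha_{ij} \ne 0$, where $1 \le i \le m$, then the elementary row
operation $r_i \to \frac{1}{\alpha_{ij}} r_i$ followed by the elementary row operation
$r_1 \leftrightarrow r_i$ gives a matrix with 1 in the $(1, j)$-position, viz:

\[ \begin{pmatrix} 0 & \dots & 0 & 1 & \beta_{1,j+1} & \dots & \beta_{1n} \\ 0 & \dots & 0 & \beta_{2j} & \beta_{2,j+1} & \dots & \beta_{2n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & \beta_{mj} & \beta_{m,j+1} & \dots & \beta_{mn} \end{pmatrix} \]

Now, the elementary row operations $r_k \to r_k - \beta_{kj} r_1$ $(k = 2, \dots, m)$
result in the matrix

\[ \begin{pmatrix} 0 & \dots & 0 & 1 & \gamma_{1,j+1} & \dots & \gamma_{1n} \\ 0 & \dots & 0 & 0 & \gamma_{2,j+1} & \dots & \gamma_{2n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & 0 & \gamma_{m,j+1} & \dots & \gamma_{mn} \end{pmatrix} \]

If at this stage, some zero rows have appeared, row interchanges can be
used so that the non-zero rows appear before the zero rows. We now
repeat the same process with the last $m - 1$ rows of this matrix to form

\[ \begin{pmatrix} 0 & \dots & 0 & 1 & \delta_{1,j+1} & \dots & \delta_{1,k-1} & \delta_{1k} & \delta_{1,k+1} & \dots & \delta_{1n} \\ 0 & \dots & 0 & 0 & 0 & \dots & 0 & 1 & \delta_{2,k+1} & \dots & \delta_{2n} \\ 0 & \dots & 0 & 0 & 0 & \dots & 0 & \delta_{3k} & \delta_{3,k+1} & \dots & \delta_{3n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & 0 & 0 & \dots & 0 & \delta_{mk} & \delta_{m,k+1} & \dots & \delta_{mn} \end{pmatrix} \tag{1} \]

After a finite number of steps a matrix which is in echelon form results.
A matrix which is in reduced echelon form is obtained from this by
carrying out a sequence of elementary row operations of the type 1.2
(ii), for example, the sequence of elementary row operations
$r_i \to r_i - \delta_{ik} r_2$ $(i = 1, 3, \dots, m)$ reduces the matrix (1) to the form

\[ \begin{pmatrix} 0 & \dots & 0 & 1 & \epsilon_{1,j+1} & \dots & \epsilon_{1,k-1} & 0 & \epsilon_{1,k+1} & \dots & \epsilon_{1n} \\ 0 & \dots & 0 & 0 & 0 & \dots & 0 & 1 & \epsilon_{2,k+1} & \dots & \epsilon_{2n} \\ 0 & \dots & 0 & 0 & 0 & \dots & 0 & 0 & \epsilon_{3,k+1} & \dots & \epsilon_{3n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & 0 & 0 & \dots & 0 & 0 & \epsilon_{m,k+1} & \dots & \epsilon_{mn} \end{pmatrix} \]

This completes the proof of the theorem.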

The above proof will be better understood if the following example 
is worked out in detail. 


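EXAMPLE

\[ \begin{pmatrix} 0 & 0 & 5 & 35 & -24 & 1 \\ 0 & 2 & 1 & -1 & 1 & 0 \\ 0 & 3 & 2 & 2 & -1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 5 & 3 & 1 & 0 & 1 \end{pmatrix} \tag{2} \]

The elementary row operations $r_2 \to \tfrac{1}{2} r_2$, $r_1 \leftrightarrow r_2$ and $r_4 \leftrightarrow r_5$ give the
matrix

\[ \begin{pmatrix} 0 & 1 & \tfrac{1}{2} & -\tfrac{1}{2} & \tfrac{1}{2} & 0 \\ 0 & 0 & 5 & 35 & -24 & 1 \\ 0 & 3 & 2 & 2 & -1 & 1 \\ 0 & 5 & 3 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \]

The elementary row operations $r_3 \to r_3 - 3r_1$, $r_4 \to r_4 - 5r_1$ give the
matrix

\[ \begin{pmatrix} 0 & 1 & \tfrac{1}{2} & -\tfrac{1}{2} & \tfrac{1}{2} & 0 \\ 0 & 0 & 5 & 35 & -24 & 1 \\ 0 & 0 & \tfrac{1}{2} & \tfrac{7}{2} & -\tfrac{5}{2} & 1 \\ 0 & 0 & \tfrac{1}{2} & \tfrac{7}{2} & -\tfrac{5}{2} & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \]

Now, the elementary row operations $r_2 \leftrightarrow r_3$, $r_1 \to r_1 - r_2$, $r_4 \to r_4 - r_2$,
$r_3 \to r_3 - 10r_2$, $r_2 \to 2r_2$ performed in that order give the matrix

\[ \begin{pmatrix} 0 & 1 & 0 & -4 & 3 & -1 \\ 0 & 0 & 1 & 7 & -5 & 2 \\ 0 & 0 & 0 & 0 & 1 & -9 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \]

and finally, $r_1 \to r_1 - 3r_3$, $r_2 \to r_2 + 5r_3$ now gives the reduced echelon
matrix

\[ \begin{pmatrix} 0 & 1 & 0 & -4 & 0 & 26 \\ 0 & 0 & 1 & 7 & 0 & -43 \\ 0 & 0 & 0 & 0 & 1 & -9 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \tag{3} \]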


Note: It is important that these elementary row operations should be 
carried out in the order stated. 
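
The elimination procedure used in the proof of Theorem 1.5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's own algorithm: the function name `rref` is hypothetical, matrices are lists of rows, exact fractions are used, and the sketch clears the entries above and below each leading 1 as soon as it is created, whereas the proof first reaches echelon form and then reduces it. The entries of the matrix named `matrix_2` are those of the matrix (2) as read from the example above.

```python
from fractions import Fraction

def rref(A):
    """Return a reduced echelon matrix row equivalent to A."""
    M = [[Fraction(x) for x in row] for row in A]
    m = len(M)
    n = len(M[0]) if M else 0
    r = 0                                    # row where the next leading 1 will go
    for c in range(n):
        pivot = next((i for i in range(r, m) if M[i][c] != 0), None)
        if pivot is None:
            continue                         # no non-zero element available in this column
        M[r], M[pivot] = M[pivot], M[r]      # operation of type (iii)
        p = M[r][c]
        M[r] = [x / p for x in M[r]]         # operation of type (i)
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]  # type (ii)
        r += 1
        if r == m:
            break
    return M

matrix_2 = [[0, 0, 5, 35, -24, 1],
            [0, 2, 1, -1, 1, 0],
            [0, 3, 2, 2, -1, 1],
            [0, 0, 0, 0, 0, 0],
            [0, 5, 3, 1, 0, 1]]
for row in rref(matrix_2):
    print([str(x) for x in row])
```

Because every step is one of the three elementary row operations, the result is row equivalent to the input, whatever order of operations is chosen.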
Exercises 1.2

1. Find the reduced echelon matrices of the following matrices

2 1 \ 

-1 1 
1 1 / 



(iii) / 2 
6 
4 
\ 2 

(V) 


2 

1 


-1 0 

0 1 



2 

4 
8 
4 

0 10 


(vi) 


i 

1 

— i 

1 - / / 0 \ 

-2 0 i 

-1+ / 1 1 


1 

l ~y/2 

o \T 

V2 

-3 

1+V2 —1 — 2\T ' 

-1 

V2 

-1 1 

j 

1.—2 

-2 +4V2 

-2-V2 3 - \ 2 ; 


2. Are the following pairs of matrices row equivalent?


(0 / 


o - 

-i\ 



3 

-1 

A 



2 

1 

0 



0 

2 

i 


\ 

\1 

-1 

l) 



ll 

-1 

J 


(ii) 

1 

-1 


1 

2 \ 

0 

-1 


2 


-2 

3 

0 

1 

1 

2 


-1 


1 

0 

— 

1 

3/ 

-2 

-5 


4 


2 

7 

3 

0 

2 
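1.3 Application to Linear Equations

Consider the system of $m$ linear equations in $n$ variables $x_1, \dots, x_n$,

\[ \sum_{j=1}^{n} \alpha_{ij} x_j = \beta_i \quad (i = 1, \dots, m) \tag{4} \]

where $\alpha_{ij}, \beta_i \in K$, then

\[ A = (\alpha_{ij})_{m \times n} \quad \text{and} \quad (A|b) = (\alpha_{ij}|\beta_i)_{m \times (n+1)} \]

are called the matrix of coefficients and the augmented matrix of the
system respectively, i.e.

\[ A = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \dots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \dots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{m1} & \alpha_{m2} & \dots & \alpha_{mn} \end{pmatrix} \quad \text{and} \quad (A|b) = \left( \begin{array}{cccc|c} \alpha_{11} & \alpha_{12} & \dots & \alpha_{1n} & \beta_1 \\ \alpha_{21} & \alpha_{22} & \dots & \alpha_{2n} & \beta_2 \\ \vdots & \vdots & & \vdots & \vdots \\ \alpha_{m1} & \alpha_{m2} & \dots & \alpha_{mn} & \beta_m \end{array} \right) \]
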

DEFINITION 1.6 An $n$-tuple $(x_1, \dots, x_n)$ which satisfies each of the
$m$ equations in the system (4) is called a solution of the system. Two
systems of linear equations are equivalent if every solution of one
system is a solution of the other system and vice versa. A system of
linear equations is called a homogeneous system if $\beta_i = 0$ $(i = 1, \dots, m)$.
A system with at least one solution is called a consistent system. The
solution $(0, \dots, 0)$ of a homogeneous system is called the trivial
solution.


Our main result is the following: 

THEOREM 1.7 If $(A'|b') = (\alpha'_{ij}|\beta'_i)$ is an $m \times (n + 1)$ matrix
obtained from the $m \times (n + 1)$ matrix $(A|b)$ by an elementary row
operation, then the systems $\sum_{j=1}^{n} \alpha_{ij} x_j = \beta_i$ $(i = 1, \dots, m)$ and
$\sum_{j=1}^{n} \alpha'_{ij} x_j = \beta'_i$ $(i = 1, \dots, m)$ are equivalent.

PROOF We first note that if a matrix $(A'|b')$ is obtained from a
matrix $(A|b)$ by an elementary row operation, then there exists an
elementary row operation on $(A'|b')$ which results in the matrix $(A|b)$;
this is the inverse of the original elementary row operation. The
inverses of the three elementary row operations

(i) $r_i \to \alpha r_i$, $\alpha \ne 0$, (ii) $r_i \to r_i + \alpha r_j$ and (iii) $r_i \leftrightarrow r_j$

are

(i)' $r_i \to \alpha^{-1} r_i$, (ii)' $r_i \to r_i - \alpha r_j$ and (iii)' $r_i \leftrightarrow r_j$

respectively. Also, whatever elementary row operation has been used to
obtain the matrix $(A'|b')$ from $(A|b)$, each equation of the resulting
system of linear equations is a linear combination of the equations in
the original system, and so a solution of the original system will also
be a solution of the new system. For example, for an elementary row
operation of type (ii), the only equation changed is the $i$th equation,
where the equation

\[ \alpha_{i1} x_1 + \dots + \alpha_{in} x_n = \beta_i \]

is replaced by

\[ (\alpha_{i1} + \alpha\alpha_{j1}) x_1 + \dots + (\alpha_{in} + \alpha\alpha_{jn}) x_n = \beta_i + \alpha\beta_j \]

and thus, if $(x_1, \dots, x_n)$ is a solution of the original system, it is also
a solution of the new system. Conversely, since the inverse of an
elementary row operation is again an elementary row operation, each solution
of the new system can similarly be shown to be a solution of the
original system.

There are some important consequences of this theorem which 
appear in the following corollaries. 

COROLLARY 1 If $\sum_{j=1}^{n} \alpha_{ij} x_j = \beta_i$ $(i = 1, \dots, m)$ is a system of
linear equations with augmented matrix $(A|b) = (\alpha_{ij}|\beta_i)$ and
$(R|s) = (\rho_{ij}|\sigma_i)$ is its reduced echelon matrix, then
$\sum_{j=1}^{n} \rho_{ij} x_j = \sigma_i$ $(i = 1, \dots, m)$ is equivalent to
$\sum_{j=1}^{n} \alpha_{ij} x_j = \beta_i$ $(i = 1, \dots, m)$.

PROOF By Theorem 1.5 the matrix $(R|s)$ is obtained from the
matrix $(A|b)$ by a finite sequence of elementary row operations. The
result now follows directly from the above theorem.

It is useful to look at this result in more detail. Suppose that in the
reduced echelon matrix $(R|s)$, the leading elements appear in columns
$j_1, j_2, \dots, j_\ell$, and that the remaining columns are $j_{\ell+1}, \dots, j_n$, then
the system takes the form

\[ \begin{aligned} x_{j_1} + \sum_{k=\ell+1}^{n} \rho_{1 j_k} x_{j_k} &= \sigma_1 \\ &\;\;\vdots \\ x_{j_\ell} + \sum_{k=\ell+1}^{n} \rho_{\ell j_k} x_{j_k} &= \sigma_\ell \\ 0 &= \sigma_{\ell+1} \end{aligned} \]

where $\ell + 1 \le m$. Then, either

(i) $\sigma_{\ell+1} = 1$ and $\sigma_1 = \dots = \sigma_\ell = 0$, in which case the system is
not consistent, or

(ii) $\sigma_{\ell+1} = 0$ and the system is consistent, i.e. the system has a
solution and this solution may now be written out explicitly by giving
arbitrary values to $x_{j_{\ell+1}}, \dots, x_{j_n}$.

If $\ell = n$, then the system has the unique solution $x_1 = \sigma_1, \dots,
x_n = \sigma_n$ and if $\ell < n$, there is an infinite number of solutions.

Thus, the method not only decides whether or not a system has a 
solution, but if a solution exists, it gives a practical method for 
obtaining that solution. As a result of the above we can now state 

COROLLARY 2 A system of linear equations is consistent if and
only if the reduced echelon matrix of its augmented matrix has no
leading element in its last column.

COROLLARY 3 If $m < n$, then a homogeneous system of $m$ linear
equations in $n$ variables has at least one non-trivial solution. If $\ell = n$,
where $\ell$ is the number of leading elements of the reduced echelon matrix,
the trivial solution is the unique solution.

COROLLARY 4 A system of $n$ linear homogeneous equations in $n$
variables has a non-trivial solution if and only if its reduced echelon
matrix $R \ne I_n$, where $I_n$ is the $n \times n$ matrix with 1 in the $(i, i)$-positions
$(i = 1, \dots, n)$ with zeros elsewhere.

PROOF If $R = I_n$, then the system reduces to $x_1 = \dots = x_n = 0$, i.e.
we have the trivial solution.

If $R \ne I_n$, then $R$ has a row of zeros, i.e. $\ell < n$, and so, as in
Corollary 3, the system has a non-trivial solution.
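
Corollaries 2 to 4 translate into short tests once the columns containing leading elements are known. The Python sketch below is an illustration under stated assumptions: the names `pivot_columns`, `is_consistent` and `has_nontrivial_solution` are hypothetical, and the sketch only carries the elimination as far as an echelon-type form, which is enough to locate the columns in which the leading elements of the reduced echelon matrix occur.

```python
from fractions import Fraction

def pivot_columns(A):
    """Columns (numbered from 1) in which leading elements occur after elimination."""
    M = [[Fraction(x) for x in row] for row in A]
    pivots, r = [], 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivots.append(c + 1)
        r += 1
        if r == len(M):
            break
    return pivots

def is_consistent(augmented):
    """Corollary 2: no leading element in the last column."""
    return len(augmented[0]) not in pivot_columns(augmented)

def has_nontrivial_solution(coefficients):
    """Homogeneous system: fewer leading elements than variables."""
    return len(pivot_columns(coefficients)) < len(coefficients[0])

print(is_consistent([[1, 1, 2], [2, 2, 3]]))                # x1 + x2 = 2, 2x1 + 2x2 = 3
print(has_nontrivial_solution([[1, -2, 1], [2, -1, 1]]))    # m < n
```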
EXAMPLES

1. Solve completely the following system of linear equations

\[ \begin{aligned} 5x_2 + 35x_3 - 24x_4 &= 1 \\ 2x_1 + x_2 - x_3 + x_4 &= 0 \\ 3x_1 + 2x_2 + 2x_3 - x_4 &= 1 \\ 5x_1 + 3x_2 + x_3 &= 1 \end{aligned} \]

Apart from an initial column of zeros and a row of zeros, the augmented
matrix of this system is the matrix (2) of the last
example in §1.2, which is row equivalent to the matrix (3). Thus,
this system of linear equations is equivalent to the system

\[ \begin{aligned} x_1 - 4x_3 &= 26 \\ x_2 + 7x_3 &= -43 \\ x_4 &= -9 \end{aligned} \]

or

\[ x_1 = 4x_3 + 26, \quad x_2 = -7x_3 - 43, \quad x_4 = -9. \]

Put $x_3 = \lambda$, then the general solution is

\[ (x_1, x_2, x_3, x_4) = (4\lambda + 26, \; -7\lambda - 43, \; \lambda, \; -9) \]

where $\lambda$ may be given arbitrary values.
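
The general solution can be checked by substituting it back into the four original equations. The following Python sketch is an illustration only; the function name `residuals` is hypothetical and returns, for a given value of the parameter, the left-hand side minus the right-hand side of each equation.

```python
def residuals(lam):
    """Left-hand side minus right-hand side of each equation of Example 1."""
    x1, x2, x3, x4 = 4 * lam + 26, -7 * lam - 43, lam, -9
    return [5 * x2 + 35 * x3 - 24 * x4 - 1,
            2 * x1 + x2 - x3 + x4,
            3 * x1 + 2 * x2 + 2 * x3 - x4 - 1,
            5 * x1 + 3 * x2 + x3 - 1]

for lam in (-2, 0, 1, 10):
    print(lam, residuals(lam))
```

Every residual is zero whatever value is chosen for the parameter, which is what it means for the one-parameter family to consist of solutions.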

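2. Solve completely the following system of linear equations

\[ \begin{aligned} x_1 - x_2 + x_3 &= 0 \\ x_1 + x_2 + 2x_3 &= 0 \\ x_1 + 2x_2 - x_3 &= 0 \end{aligned} \]

The matrix of coefficients of this system is

\[ \begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & -1 \end{pmatrix} \]

By carrying out the following elementary row operations, we obtain its
reduced echelon matrix

\[ \begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & -1 \end{pmatrix} \xrightarrow[\; r_3 \to r_3 - r_1 \;]{\; r_2 \to r_2 - r_1 \;} \begin{pmatrix} 1 & -1 & 1 \\ 0 & 2 & 1 \\ 0 & 3 & -2 \end{pmatrix} \xrightarrow[\; r_1 \to r_1 + r_2, \; r_3 \to r_3 - 3r_2 \;]{\; r_2 \to \frac{1}{2} r_2 \;} \begin{pmatrix} 1 & 0 & \tfrac{3}{2} \\ 0 & 1 & \tfrac{1}{2} \\ 0 & 0 & -\tfrac{7}{2} \end{pmatrix} \]

\[ \xrightarrow{\; r_3 \to -\frac{2}{7} r_3 \;} \begin{pmatrix} 1 & 0 & \tfrac{3}{2} \\ 0 & 1 & \tfrac{1}{2} \\ 0 & 0 & 1 \end{pmatrix} \xrightarrow[\; r_2 \to r_2 - \frac{1}{2} r_3 \;]{\; r_1 \to r_1 - \frac{3}{2} r_3 \;} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

which implies by Corollary 4 that the trivial solution $x_1 = x_2 = x_3 = 0$
is the unique solution.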

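3. For what value of $k$ will the system

\[ \begin{aligned} 2x_1 + x_2 &= 5 \\ x_1 - 3x_2 &= -1 \\ 3x_1 + 4x_2 &= k \end{aligned} \]

be consistent? For that value of $k$, find the complete solution.

\[ \begin{pmatrix} 2 & 1 & 5 \\ 1 & -3 & -1 \\ 3 & 4 & k \end{pmatrix} \xrightarrow{\; r_1 \leftrightarrow r_2 \;} \begin{pmatrix} 1 & -3 & -1 \\ 2 & 1 & 5 \\ 3 & 4 & k \end{pmatrix} \xrightarrow[\; r_3 \to r_3 - 3r_1 \;]{\; r_2 \to r_2 - 2r_1 \;} \begin{pmatrix} 1 & -3 & -1 \\ 0 & 7 & 7 \\ 0 & 13 & k + 3 \end{pmatrix} \]

\[ \xrightarrow[\; r_3 \to r_3 - 13r_2, \; r_1 \to r_1 + 3r_2 \;]{\; r_2 \to \frac{1}{7} r_2 \;} \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & k - 10 \end{pmatrix} \]

Thus, by Corollary 2, the system of linear equations is consistent if
and only if $k - 10 = 0$, i.e. $k = 10$. If $k = 10$, the reduced echelon
matrix is

\[ \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix} \]

and the solution is $x_1 = 2$, $x_2 = 1$.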


Exercises 1.3 


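1. Solve the homogeneous systems of linear equations with the
following matrices of coefficients

(i) $\begin{pmatrix} 2 & -1 & 1 & 0 & 1 \\ 1 & 2 & -1 & 2 & 1 \\ 1 & -8 & 5 & -6 & -1 \end{pmatrix}$

(ii) $\begin{pmatrix} 1 & -1 & 0 & 2 & 3 \\ 5 & 2 & 1 & 0 & 1 \\ -1 & 1 & 2 & -3 & 4 \end{pmatrix}$

(iii) $\begin{pmatrix} 1 & 0 & -1 & 1 \\ 2 & 1 & -1 & 1 \\ -1 & 2 & 0 & 1 \\ 1 & 0 & 2 & -3 \end{pmatrix}$

(iv) $\begin{pmatrix} 5 & -3 & 2 \\ 1 & 0 & -1 \\ 2 & -3 & 5 \\ 3 & -6 & 11 \\ 7 & -3 & 0 \end{pmatrix}$
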

2. Solve the non-homogeneous systems of linear equations with the 
following augmented matrices 


(i) $\begin{pmatrix} 2 & -1 & 0 & 1 \\ -1 & 1 & 2 & 1 \\ 1 & 0 & 2 & 2 \end{pmatrix}$



(ii) 

3 1 

-1 1 

2 \ 

5 


-1 0 

2 5 

3 

4/ 


l 2 1 

1 6 

2 


14 


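(iii) $\begin{pmatrix} 2 & 1 & -1 & 1 \\ 5 & 4 & 2 & 4 \\ 0 & 1 & 1 & 0 \\ -1 & 1 & 5 & 1 \end{pmatrix}$

(iv) $\begin{pmatrix} 2 & 0 & 1 & -1 \\ -1 & 0 & 2 & 3 \\ 1 & 1 & 0 & -1 \\ 1 & 1 & 5 & 4 \end{pmatrix}$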

3. Find the conditions which $\lambda$ and $\mu$ must satisfy for the following
systems of linear equations to have (i) a unique solution, (ii) no
solution, (iii) an infinite number of solutions

(a)
\[ \begin{aligned} 2x + 3y + z &= 5 \\ 3x - y + \lambda z &= 2 \\ x + 7y - 6z &= \mu \end{aligned} \]

(b)
\[ \begin{aligned} x + y - 4z &= 0 \\ 2x + 3y + z &= 1 \\ 4x + 7y + \lambda z &= \mu \end{aligned} \]

4. For what values of $\lambda$ have the equations

\[ \begin{aligned} x + y + z &= a \\ \lambda x + 2y + z &= b \\ \lambda^2 x + 4y + z &= c \end{aligned} \]

a unique solution? In the exceptional cases, find the conditions to be
satisfied by $a$, $b$, $c$ in order that a solution may exist and find the general
solution.

5. Determine the values of $\lambda$ for which the following systems of
equations are consistent and for those values of $\lambda$ find the complete
solutions

(i) $5x + 2y - z = 1$              (ii) $x + 2y + \lambda z = 0$

    $2x + 3y + 4z = 7$                  $2x + 3y - 2z = \lambda$

    $4x - 5y + \lambda z = \lambda - 5$             $\lambda x + y + \lambda^2 z = 3$
0


(iii) x 4- 5y + 3 = 0 
5x + y — \ — 0 
x + 2y 4- \ = 0 

6. Find the complete solution of 
y 4- z + u + 2v 


(iv) 2 x + 3 y + z + t = 

x + 2y + z - t = 1 

3x + 5_y + 2z 4- 1 = a 

6 x + lOy + 4z + t = (X + 
system of equations 
2 



— x 4- 4y + 3z 4- 3w + 4v = 7 

2x 4- jy + 3z + 2w + 8z> = 3 

3x 4- y 4- 4z — u + 4z> = 0 

5x + 2y + 7z + lOv = 2 

7. Solve completely the system of linear equations

\[ \begin{aligned} x_1 + x_2 &= 0 \\ x_1 + x_2 + x_3 &= 0 \\ x_2 + x_3 + x_4 &= 0 \\ &\;\;\vdots \\ x_{n-3} + x_{n-2} + x_{n-1} &= 0 \\ x_{n-2} + x_{n-1} + x_n &= 0 \\ x_{n-1} + x_n &= 0 \end{aligned} \]

when (i) $n = 8$ (ii) $n = 9$.


1.4 Matrix Algebra 

DEFINITION 1.8 EQUALITY OF MATRICES If $A = (\alpha_{ij})$ is an
$m \times n$ matrix and $B = (\beta_{ij})$ is a $p \times q$ matrix then $A = B$ if and only if
$m = p$, $n = q$ and $\alpha_{ij} = \beta_{ij}$ $(i = 1, \dots, m; \; j = 1, \dots, n)$.

Zero Matrix The $m \times n$ matrix $0 = (\alpha_{ij})$ such that $\alpha_{ij} = 0$
$(i = 1, \dots, m; \; j = 1, \dots, n)$ is called the zero matrix.

Matrix Addition If $A = (\alpha_{ij})$ is an $m \times n$ matrix and $B = (\beta_{ij})$ is a
$p \times q$ matrix then $A + B$ is defined if and only if $m = p$, $n = q$ and then

\[ A + B = (\alpha_{ij} + \beta_{ij}) \]

is the $m \times n$ matrix obtained by adding the corresponding elements in
$A$ and $B$.

Scalar Multiplication If $A = (\alpha_{ij})$ is an $m \times n$ matrix and $\alpha \in K$, then

\[ \alpha A = (\alpha\alpha_{ij}) \]

is the $m \times n$ matrix obtained by multiplying each element of $A$ by $\alpha$;
$\alpha A$ is called the scalar multiple of $A$ by $\alpha$.


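EXAMPLE

\[ \begin{pmatrix} 1 & 2 & 5 \\ 3 & 4 & 6 \end{pmatrix} + \begin{pmatrix} 0 & 1 & -1 \\ 2 & 5 & -3 \end{pmatrix} = \begin{pmatrix} 1 + 0 & 2 + 1 & 5 - 1 \\ 3 + 2 & 4 + 5 & 6 - 3 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 4 \\ 5 & 9 & 3 \end{pmatrix} \]

but

\[ \begin{pmatrix} 1 & 2 & 5 \\ 3 & 4 & 6 \end{pmatrix} + \begin{pmatrix} -1 & 1 \\ 3 & 5 \end{pmatrix} \]

is not defined.

If $A = \begin{pmatrix} 1 & 2 & 5 \\ 3 & 4 & 6 \end{pmatrix}$ and $\alpha = -2$, then

\[ -2A = \begin{pmatrix} -2 & -4 & -10 \\ -6 & -8 & -12 \end{pmatrix} \]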

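The definitions of matrix addition and scalar multiplication are the
expected definitions for matrices, but as motivation for matrix
multiplication, we consider the following.

Suppose that we have the following systems of linear equations,

\[ \begin{aligned} y_1 &= \alpha_{11} x_1 + \alpha_{12} x_2 + \alpha_{13} x_3 \\ y_2 &= \alpha_{21} x_1 + \alpha_{22} x_2 + \alpha_{23} x_3 \end{aligned} \tag{5} \]

and

\[ \begin{aligned} x_1 &= \beta_{11} z_1 + \beta_{12} z_2 \\ x_2 &= \beta_{21} z_1 + \beta_{22} z_2 \\ x_3 &= \beta_{31} z_1 + \beta_{32} z_2 \end{aligned} \tag{6} \]

Then, substituting (6) in (5) gives

\[ \begin{aligned} y_1 &= \alpha_{11}(\beta_{11} z_1 + \beta_{12} z_2) + \alpha_{12}(\beta_{21} z_1 + \beta_{22} z_2) + \alpha_{13}(\beta_{31} z_1 + \beta_{32} z_2) \\ &= (\alpha_{11}\beta_{11} + \alpha_{12}\beta_{21} + \alpha_{13}\beta_{31}) z_1 + (\alpha_{11}\beta_{12} + \alpha_{12}\beta_{22} + \alpha_{13}\beta_{32}) z_2 \\ y_2 &= \alpha_{21}(\beta_{11} z_1 + \beta_{12} z_2) + \alpha_{22}(\beta_{21} z_1 + \beta_{22} z_2) + \alpha_{23}(\beta_{31} z_1 + \beta_{32} z_2) \\ &= (\alpha_{21}\beta_{11} + \alpha_{22}\beta_{21} + \alpha_{23}\beta_{31}) z_1 + (\alpha_{21}\beta_{12} + \alpha_{22}\beta_{22} + \alpha_{23}\beta_{32}) z_2 \end{aligned} \]

or in other words, if

\[ A = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} \\ \alpha_{21} & \alpha_{22} & \alpha_{23} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \\ \beta_{31} & \beta_{32} \end{pmatrix} \]

are the matrices of coefficients of the systems (5) and (6) respectively,
then the matrix of coefficients of the resulting system, which we shall
denote by $AB$, is

\[ \begin{pmatrix} \alpha_{11}\beta_{11} + \alpha_{12}\beta_{21} + \alpha_{13}\beta_{31} & \alpha_{11}\beta_{12} + \alpha_{12}\beta_{22} + \alpha_{13}\beta_{32} \\ \alpha_{21}\beta_{11} + \alpha_{22}\beta_{21} + \alpha_{23}\beta_{31} & \alpha_{21}\beta_{12} + \alpha_{22}\beta_{22} + \alpha_{23}\beta_{32} \end{pmatrix} \]

That is, if $A$ is a $2 \times 3$ matrix and $B$ is a $3 \times 2$ matrix, then $AB$ is a
$2 \times 2$ matrix. Note, for example, that the $(1, 1)$-element in $AB$ has been
obtained by multiplying the elements of the first row of $A$ with the
corresponding elements in the first column of $B$ and summing.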

We use this now to give the following definition of matrix 
multiplication. 

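DEFINITION 1.9 If $A$ is an $m \times n$ matrix and $B$ is a $p \times q$ matrix
then $AB$ is defined if and only if $n = p$ and then $AB$ is the $m \times q$ matrix

\[ AB = \Bigl( \sum_{k=1}^{n} \alpha_{ik}\beta_{kj} \Bigr)_{m \times q} \]

that is, the $(i, j)$th element of $AB$ is obtained by multiplying the elements
in the $i$th row of $A$ with the corresponding elements in the $j$th column
of $B$ and summing, i.e. if

\[ A = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \dots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \dots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{m1} & \alpha_{m2} & \dots & \alpha_{mn} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} \beta_{11} & \beta_{12} & \dots & \beta_{1q} \\ \beta_{21} & \beta_{22} & \dots & \beta_{2q} \\ \vdots & \vdots & & \vdots \\ \beta_{n1} & \beta_{n2} & \dots & \beta_{nq} \end{pmatrix} \]

then

\[ AB = \begin{pmatrix} \alpha_{11}\beta_{11} + \alpha_{12}\beta_{21} + \dots + \alpha_{1n}\beta_{n1} & \alpha_{11}\beta_{12} + \alpha_{12}\beta_{22} + \dots + \alpha_{1n}\beta_{n2} & \dots \\ \alpha_{21}\beta_{11} + \alpha_{22}\beta_{21} + \dots + \alpha_{2n}\beta_{n1} & \alpha_{21}\beta_{12} + \alpha_{22}\beta_{22} + \dots + \alpha_{2n}\beta_{n2} & \dots \\ \vdots & \vdots & \end{pmatrix} \]

The system of $m$ linear equations in $n$ variables $\sum_{j=1}^{n} \alpha_{ij} x_j = \beta_i$
$(i = 1, \dots, m)$ can be expressed in matrix form as $AX = b$, where

\[ A = (\alpha_{ij})_{m \times n}, \quad X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad b = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{pmatrix} \]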

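EXAMPLE

Let $A = \begin{pmatrix} 2 & 5 & 3 \\ 1 & 2 & -1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 \\ -1 & 1 \\ 2 & 1 \end{pmatrix}$, then $AB$ is defined and
is the $2 \times 2$ matrix

\[ \begin{pmatrix} 2 \cdot 1 + 5 \cdot (-1) + 3 \cdot 2 & 2 \cdot 0 + 5 \cdot 1 + 3 \cdot 1 \\ 1 \cdot 1 + 2 \cdot (-1) + (-1) \cdot 2 & 1 \cdot 0 + 2 \cdot 1 + (-1) \cdot 1 \end{pmatrix} = \begin{pmatrix} 3 & 8 \\ -3 & 1 \end{pmatrix} \]

$BA$ is also defined and is the $3 \times 3$ matrix

\[ \begin{pmatrix} 1 \cdot 2 + 0 \cdot 1 & 1 \cdot 5 + 0 \cdot 2 & 1 \cdot 3 + 0 \cdot (-1) \\ -1 \cdot 2 + 1 \cdot 1 & -1 \cdot 5 + 1 \cdot 2 & -1 \cdot 3 + 1 \cdot (-1) \\ 2 \cdot 2 + 1 \cdot 1 & 2 \cdot 5 + 1 \cdot 2 & 2 \cdot 3 + 1 \cdot (-1) \end{pmatrix} = \begin{pmatrix} 2 & 5 & 3 \\ -1 & -3 & -4 \\ 5 & 12 & 5 \end{pmatrix} \]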



We note that $AB$ and $BA$ are not even of the same shape and thus in
general $AB \ne BA$. Even if $A$ and $B$ are of the same shape, $AB \ne BA$ in
general, for example


if 




and 


and 



, then 

1 
1 
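
The rule of Definition 1.9 and the failure of the commutative law can be illustrated with a short Python sketch. The function name `matmul` is hypothetical, and the square matrices `P` and `Q` are chosen here purely for illustration; they are an assumption of this sketch and not the pair used in the text.

```python
def matmul(A, B):
    """AB, defined only if A has as many columns as B has rows."""
    n = len(A[0])
    if n != len(B):
        raise ValueError("AB is defined only if A has as many columns as B has rows")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]

# The 2 x 3 and 3 x 2 matrices of the example: AB is 2 x 2 and BA is 3 x 3.
A = [[2, 5, 3], [1, 2, -1]]
B = [[1, 0], [-1, 1], [2, 1]]
print(matmul(A, B))
print(matmul(B, A))

# Two square matrices of the same shape (illustrative choice) with PQ != QP.
P = [[0, 1], [0, 0]]
Q = [[1, 0], [0, 0]]
print(matmul(P, Q), matmul(Q, P))
```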


We have seen in the above example that one of the laws of algebra, 
namely the commutative rule of multiplication is not satisfied. In the 
remainder of this book, a great deal of our work will depend on the 
manipulation of matrices and it would be of interest whether or not 
there are any further restrictions on the normal manipulations allowed 
in algebra. The following theorem shows that there are not. 


THEOREM 1.10 If $A = (\alpha_{ij})$, $B = (\beta_{ij})$, $C = (\gamma_{ij})$ are $m \times n$, $s \times t$ and
$p \times q$ matrices respectively, then

(i) (Commutative Law of Addition)

$A + B = B + A$, where both sides are defined if and only if $m = s$,
$n = t$.

(ii) (Associative Law of Addition)

$(A + B) + C = A + (B + C)$ where both sides are defined if and
only if $m = s = p$, $n = t = q$.

(iii) $A + 0 = A$.

(iv) $A + (-A) = 0$.

(v) (Associative Law of Multiplication)

$A(BC) = (AB)C$ where both sides are defined if and only if $n = s$,
$t = p$.

(vi) (Left and Right Distributive Laws)

$A(B + C) = AB + AC$ where both sides are defined if and only if
$n = s = p$, $t = q$

$(B + C)A = BA + CA$ where both sides are defined if and only if
$s = p$, $t = q = m$.


PROOF We shall only prove (v); the remaining parts are proved using
a similar method.

$AB$ is defined if and only if $n = s$ and is an $m \times t$ matrix.

$(AB)C$ is defined if and only if $n = s$, $t = p$ and is an $m \times q$ matrix.

$BC$ is defined if and only if $t = p$ and is an $s \times q$ matrix.

$A(BC)$ is defined if and only if $n = s$, $t = p$ and is an $m \times q$ matrix.

Thus we see that $(AB)C$ and $A(BC)$ are defined under the same
conditions and have the same shape. Using the rules for matrix
multiplication, we now further see that

\[ (AB)C = \Bigl( \sum_{l=1}^{t} \Bigl( \sum_{k=1}^{n} \alpha_{ik}\beta_{kl} \Bigr) \gamma_{lj} \Bigr)_{m \times q} = \Bigl( \sum_{k=1}^{n} \alpha_{ik} \Bigl( \sum_{l=1}^{t} \beta_{kl}\gamma_{lj} \Bigr) \Bigr)_{m \times q} = A(BC). \]

But the reader should be further warned concerning manipulations
of matrices, as illustrated by the following examples 

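(i)
\[ \begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ -1 & -2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \]

that is $AB = 0$, but neither $A$ nor $B$ is 0.

(ii) $AB = AC \not\Rightarrow B = C$,

i.e. $A(B - C) = 0$ can be true with $A \ne 0$ and $B \ne C$.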
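
Both warnings can be checked numerically. The Python sketch below is an illustration only; `matmul` is a hypothetical helper name, and taking `C` to be the zero matrix in (ii) is a choice made here for the purpose of the example.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 1], [2, 2]]
B = [[1, 2], [-1, -2]]
zero = [[0, 0], [0, 0]]

# (i) AB = 0 although neither A nor B is the zero matrix.
print(matmul(A, B) == zero, A != zero, B != zero)

# (ii) With C = 0 (illustrative choice), AB = AC but B != C.
C = zero
print(matmul(A, B) == matmul(A, C), B != C)
```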

Exercises 1.4 

1. Add and multiply the following matrices (if possible) 


/ 1 
-1 
\ 2 


3\ 

2 

3 / 


/-l 


\ 1 


0 

1 

1 


1\ 
2 
2 



2 

0 

1 


4 \ 

3 

3 / 





Also test the associative and distributive laws on these matrices. 

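2. If the matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ commutes with the matrix $B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$,
show that $b = c = 0$. Hence show that if $A$ commutes with every $2 \times 2$
matrix, it has the form

\[ A = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \]
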


3. Use the fact that a square matrix $X$ commutes with $X^2$ to show that if

\[ X^2 = \begin{pmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{pmatrix} \]

then $X$ is of the form $\begin{pmatrix} a & b & c \\ 0 & a & b \\ 0 & 0 & a \end{pmatrix}$. Hence, find all the matrices $X$ which
satisfy the above equation.

4. If $A = \begin{pmatrix} 7 & 4 \\ -9 & -5 \end{pmatrix}$, prove that for positive integral $n$,

\[ A^n = \begin{pmatrix} 1 + 6n & 4n \\ -9n & 1 - 6n \end{pmatrix} \]

Verify that the result is true when $n$ is a negative integer.

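5. Find the most general real $n \times n$ matrix which commutes with the
matrix

\[ \begin{pmatrix} 0 & 0 & \dots & 0 & 1 \\ 0 & 0 & \dots & 1 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 1 & \dots & 0 & 0 \\ 1 & 0 & \dots & 0 & 0 \end{pmatrix} \]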

1.5 Special Types of Matrices 

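1. IDENTITY MATRIX

The $n \times n$ matrix

\[ \begin{pmatrix} 1 & 0 & 0 & \dots & 0 \\ 0 & 1 & 0 & \dots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \dots & 0 & 1 \end{pmatrix} \]

is called the identity matrix, denoted by $I_n$ or $I$, that is $I = (\alpha_{ij})$,
where $\alpha_{ii} = 1$, $\alpha_{ij} = 0$ $(i \ne j)$.

It has the property that

\[ A I_n = A = I_n A \]

for all $n \times n$ matrices $A$.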

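2. DIAGONAL MATRIX

The $n \times n$ matrix

\[ \begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{pmatrix} \]

is called a diagonal matrix and denoted by $\operatorname{diag}(\lambda_1, \dots, \lambda_n)$.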

3. INVERSE MATRIX 

If $A = (\alpha_{ij})$ is an $n \times n$ matrix then an $n \times n$ matrix $B$ such that

\[ AB = BA = I_n \]

is called an inverse of the matrix $A$.

LEMMA 1.11 The inverse of a matrix is unique.

PROOF Let $C$ also be an inverse of $A$, then

\[ AC = CA = I_n \]

and

\[ C = C I_n = C(AB) = (CA)B = I_n B = B. \]

The unique inverse of $A$ is denoted by $A^{-1}$.

A matrix $A$ which has an inverse is called a non-singular or invertible
matrix, and if it has no inverse, it is called a singular or non-invertible
matrix.

LEMMA 1.12 If $A$ and $B$ are invertible $n \times n$ matrices, then $AB$ is
invertible and $(AB)^{-1} = B^{-1}A^{-1}$.

PROOF If $A$ and $B$ are invertible, then $AA^{-1} = I_n = A^{-1}A$ and
$BB^{-1} = I_n = B^{-1}B$. Now

\[ (AB)(B^{-1}A^{-1}) = I_n = (B^{-1}A^{-1})(AB) \]

and so $AB$ is invertible and since this inverse is unique by Lemma 1.11,
we have

\[ (AB)^{-1} = B^{-1}A^{-1}. \]
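
Lemma 1.12 can be checked on a small case. In the Python sketch below, the helper names `matmul` and `is_inverse` are hypothetical, and the matrices `A`, `B` and their inverses are chosen here for illustration; they are an assumption of this sketch rather than an example from the text.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def is_inverse(A, B):
    """True if AB = BA = I_n for square matrices A and B."""
    n = len(A)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(A, B) == identity and matmul(B, A) == identity

A, A_inv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]
B, B_inv = [[1, 0], [2, 1]], [[1, 0], [-2, 1]]

AB = matmul(A, B)
print(is_inverse(AB, matmul(B_inv, A_inv)))   # the order B^{-1} A^{-1} of the lemma
print(is_inverse(AB, matmul(A_inv, B_inv)))   # the opposite order, for comparison
```

The comparison with the opposite order shows why the factors must be reversed when the inverse of a product is taken.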

Later in this section a method will be given for calculating the inverse
of a matrix and for determining when a matrix is non-singular.

4. TRANSPOSE OF A MATRIX 

If $A = (\alpha_{ij})$ is an $m \times n$ matrix, the transpose of $A$ is the $n \times m$
matrix denoted by $A^t$ obtained by interchanging the rows and columns
of $A$, that is

\[ A^t = (\beta_{ij}) \]

where $\beta_{ij} = \alpha_{ji}$ $(i = 1, \dots, n; \; j = 1, \dots, m)$.

EXAMPLE

IM= (2 °1 _ 3 ) then A ' 

The following will be useful later. 




LEMMA 1.13 If A and B are m x n and p x q matrices respectively, then

(i) $(A + B)^t = A^t + B^t$, where both sides are defined if and only if m = p, n = q.

(ii) $(AB)^t = B^t A^t$, where both sides are defined if and only if n = p.

PROOF We shall prove (ii) only.

AB is defined if and only if n = p and it is an m x q matrix. Thus, $(AB)^t$ is defined if and only if n = p and $(AB)^t$ is a q x m matrix. Similarly $B^t A^t$ is defined under the same conditions.

If $A = (\alpha_{ij})_{m \times n}$, $B = (\beta_{ij})_{n \times q}$, then

$$(AB)^t = \Big( \sum_{k=1}^{n} \alpha_{ik}\beta_{kj} \Big)^t_{m \times q} = (\gamma_{ij})_{q \times m}$$

where $\gamma_{ij} = \sum_{k=1}^{n} \alpha_{jk}\beta_{ki}$.

If $A^t = (\alpha'_{ij})_{n \times m}$, $B^t = (\beta'_{ij})_{q \times n}$, so that $\alpha'_{ij} = \alpha_{ji}$ and $\beta'_{ij} = \beta_{ji}$, then

$$B^t A^t = \Big( \sum_{k=1}^{n} \beta'_{ik}\alpha'_{kj} \Big)_{q \times m} = \Big( \sum_{k=1}^{n} \alpha_{jk}\beta_{ki} \Big)_{q \times m} = (AB)^t, \text{ as required} \quad ■$$
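The index bookkeeping in this proof can be checked numerically. The following is a minimal sketch in Python, assuming matrices are stored as lists of rows; the helper names `transpose` and `matmul` and the sample matrices are illustrative choices, not part of the text.

```python
def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Product of an m x n matrix and an n x q matrix (lists of rows)."""
    assert len(a[0]) == len(b), "AB is defined only if n = p"
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, 2, 0], [3, -1, 4]]      # a 2 x 3 matrix (hypothetical data)
B = [[2, 1], [0, 5], [-3, 2]]    # a 3 x 2 matrix (hypothetical data)

lhs = transpose(matmul(A, B))             # (AB)^t, a 2 x 2 matrix
rhs = matmul(transpose(B), transpose(A))  # B^t A^t
print(lhs == rhs)
```

Note that the order of the factors is reversed on the right-hand side; with these shapes $A^t B^t$ is a 3 x 3 matrix and cannot equal $(AB)^t$.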

COROLLARY If $A_1, A_2, \ldots, A_k$ are k matrices such that $A_{i+1}$ has the same number of rows as $A_i$ has columns ($i = 1, \ldots, k - 1$), then

$$(A_1 A_2 \cdots A_k)^t = A_k^t A_{k-1}^t \cdots A_1^t.$$

PROOF Use induction on k. ■

LEMMA 1.14 If A is an n x n matrix then $A^t$ is invertible if and only if A is invertible. Further $(A^t)^{-1} = (A^{-1})^t$.

PROOF If A is invertible then $AA^{-1} = I_n = A^{-1}A$; transposing gives $(A^{-1})^t A^t = I_n = A^t (A^{-1})^t$ and so $A^t$ is invertible. The proof of the converse is similar. The above also implies by Lemma 1.11 that

$$(A^t)^{-1} = (A^{-1})^t. \quad ■$$


5. SYMMETRIC, SKEW-SYMMETRIC AND ORTHOGONAL 
MATRICES 

An n x n matrix A is called

(i) a symmetric matrix if $A^t = A$

(ii) a skew-symmetric matrix if $A^t = -A$

(iii) an orthogonal matrix if $A^t A = AA^t = I_n$.


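EXAMPLES

$$\begin{pmatrix} a & b & c \\ b & d & e \\ c & e & f \end{pmatrix} \text{ is a symmetric matrix,}$$

$$\begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix} \text{ is a skew-symmetric matrix}$$

and

$$\frac{1}{3}\begin{pmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{pmatrix} \text{ is an orthogonal matrix.}$$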


Remarks 

1. In a symmetric matrix, the elements are symmetric about the 
diagonal of the matrix. 

2. In a skew-symmetric matrix, the diagonal elements are all zero. 

3. If A is an orthogonal matrix, then $A^t$ is its inverse.


Exercises 1.5 

1. If A is an n x n matrix, prove that

(i) $A^t A$ is a symmetric matrix,

(ii) $A + A^t$ is a symmetric matrix and $A - A^t$ is a skew-symmetric matrix,

(iii) A is the sum of a symmetric and a skew-symmetric matrix.

2. If A and B are symmetric and skew-symmetric matrices respectively and AB = BA and A + B is non-singular, prove that the matrix

$$(A + B)^{-1}(A - B)$$

is orthogonal.

3. Show that, if I + SA is non-singular, where A is a symmetric matrix and S is a skew-symmetric matrix, then the matrix

$$L = (I - SA)(I + SA)^{-1}$$

is such that $L^t A L = A$. Conversely, if $L^t A L = A$, where A is symmetric and I + L and A are non-singular, show that $S = (I + L)^{-1}(I - L)A^{-1}$ is skew-symmetric.

4. If

$$A(\alpha) = \begin{pmatrix} 1 & \alpha & \alpha^2/2 \\ 0 & 1 & \alpha \\ 0 & 0 & 1 \end{pmatrix}$$

show that $A(\alpha)A(\beta) = A(\alpha + \beta)$. Hence, find the inverse of $A(\alpha)$.

Show that

$$A(3\alpha) - 3A(2\alpha) + 3A(\alpha) = I$$

and hence find a cubic equation satisfied by $A(\alpha)$.

5. If A is a real matrix, express the sum of the diagonal elements of $A^t A$ in terms of elements of A. Hence, show that if A, B are real and symmetric and C is real and skew-symmetric then $A^2 + B^2 = C^2$ implies A = B = C = 0. Does this conclusion still hold if A is not necessarily symmetric?

6. Prove that every 2 x 2 matrix X such that $X^t A X = B$, where

A = ^ B = ^ ^ has one of the forms 

a \a l \ 

or 

a -Wj 

Find all the matrices X which satisfy the additional relation $X^t X = I$.

1.6 Elementary Matrices 

DEFINITION 1.15 An n x n matrix is called an elementary matrix 
if it can be obtained by applying an elementary row operation to I n . 

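Thus, elementary matrices are one of the following three types corresponding to the elementary row operations $r_i \to \alpha r_i$ ($\alpha \neq 0$), $r_i \leftrightarrow r_j$, $r_i \to r_i + \alpha r_j$ respectively:

$M_i(\alpha)$ ($i = 1, \ldots, n$): the matrix obtained from $I_n$ by replacing the ith diagonal element by $\alpha$,

$$M_i(\alpha) = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \alpha & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}$$

$H_{ij}$: the matrix obtained from $I_n$ by interchanging the ith and jth rows, so that it has 0 in the (i, i) and (j, j) positions and 1 in the (i, j) and (j, i) positions,

$A_{ij}(\alpha)$ ($i \neq j$): the matrix obtained from $I_n$ by placing $\alpha$ in the (i, j) position,

where every other non-diagonal element is zero.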

LEMMA 1.16 (i) If an m x n matrix B can be obtained from an m x n matrix A by applying an elementary row operation, then B is equal to the product of the corresponding m x m elementary matrix with A, i.e. if e is the elementary row operation

$$B = e(A) = e(I_m)A$$

(ii) Every elementary matrix is invertible, and the inverse of each elementary matrix is elementary.
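PROOF (i) and (ii) are proved by considering each separate type.

(i) If the operation is $r_i \to \alpha r_i$ then

$$M_i(\alpha)A = \begin{pmatrix} \alpha_{11} & \cdots & \alpha_{1n} \\ \vdots & & \vdots \\ \alpha\alpha_{i1} & \cdots & \alpha\alpha_{in} \\ \vdots & & \vdots \\ \alpha_{m1} & \cdots & \alpha_{mn} \end{pmatrix}$$

and similarly

$$H_{ij}A = \begin{pmatrix} \alpha_{11} & \cdots & \alpha_{1n} \\ \vdots & & \vdots \\ \alpha_{j1} & \cdots & \alpha_{jn} \\ \vdots & & \vdots \\ \alpha_{i1} & \cdots & \alpha_{in} \\ \vdots & & \vdots \\ \alpha_{m1} & \cdots & \alpha_{mn} \end{pmatrix}
\quad\text{and}\quad
A_{ij}(\alpha)A = \begin{pmatrix} \alpha_{11} & \cdots & \alpha_{1n} \\ \vdots & & \vdots \\ \alpha_{i1} + \alpha\alpha_{j1} & \cdots & \alpha_{in} + \alpha\alpha_{jn} \\ \vdots & & \vdots \\ \alpha_{m1} & \cdots & \alpha_{mn} \end{pmatrix}$$

where in $H_{ij}A$ the ith and jth rows of A have been interchanged.

(ii) For this part verify that

$$(M_i(\alpha))^{-1} = M_i(\alpha^{-1}), \quad (H_{ij})^{-1} = H_{ij}$$

and $(A_{ij}(\alpha))^{-1} = A_{ij}(-\alpha)$ ■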
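Both parts of the lemma are easy to try out on small cases. The sketch below, in Python, builds the three types of elementary matrices and applies them by left multiplication; the function names `scale`, `swap` and `add_multiple` are hypothetical names chosen for this illustration, and rows are numbered from 1 to match the text.

```python
def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def scale(n, i, alpha):
    """M_i(alpha): I_n with the i-th diagonal element replaced by alpha."""
    e = identity(n)
    e[i - 1][i - 1] = alpha
    return e


def swap(n, i, j):
    """H_ij: I_n with rows i and j interchanged."""
    e = identity(n)
    e[i - 1], e[j - 1] = e[j - 1], e[i - 1]
    return e


def add_multiple(n, i, j, alpha):
    """A_ij(alpha): I_n with alpha placed in position (i, j), i != j."""
    e = identity(n)
    e[i - 1][j - 1] = alpha
    return e


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, 0, 1], [2, -1, 0], [1, 1, 0]]
E = add_multiple(3, 2, 1, -2)          # corresponds to r_2 -> r_2 - 2 r_1
print(matmul(E, A))                    # A with 2 x row 1 subtracted from row 2
print(matmul(E, add_multiple(3, 2, 1, 2)) == identity(3))  # inverse is A_21(2)
```

Multiplying on the left acts on rows; the same matrices multiplied on the right act on columns, which is the point of view taken in § 1.7.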

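EXAMPLE

If

$$A = \begin{pmatrix} 1 & 0 & 1 \\ 2 & -1 & 0 \\ 1 & 1 & 0 \end{pmatrix} \xrightarrow{\; r_2 \to r_2 - 2r_1 \;} \begin{pmatrix} 1 & 0 & 1 \\ 0 & -1 & -2 \\ 1 & 1 & 0 \end{pmatrix}$$

then

$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & -1 & -2 \\ 1 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 1 \\ 2 & -1 & 0 \\ 1 & 1 & 0 \end{pmatrix}$$

Also

$$\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$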


We are now ready to prove the main result in this section. 


THEOREM 1.17 If an m x n matrix A is row equivalent to an m x n matrix B, then there exist a finite number of elementary matrices $E_1, E_2, \ldots, E_k$ such that

$$B = E_1 E_2 \cdots E_k A$$

PROOF In Theorem 1.5, we saw that B can be obtained from A by a finite, say k, sequence of elementary row operations. The proof is by induction on k. If k = 1, then by Lemma 1.16 (i) $B = E_1 A$ for some elementary matrix $E_1$.

If k > 1 we assume that if a matrix B' can be obtained from A by k - 1 elementary row operations, then there exist elementary matrices $E_2, \ldots, E_k$ such that

$$B' = E_2 \cdots E_k A$$

Now B is obtained from B' by applying one further elementary row operation and by Lemma 1.16 (i)

$$B = E_1 B'$$

for some elementary matrix $E_1$ and thus

$$B = E_1(E_2 \cdots E_k A) = E_1 E_2 \cdots E_k A$$

as required. ■

There are two important corollaries to this theorem. 

COROLLARY 1 If R is the reduced echelon matrix of A, then there exist a finite number of elementary matrices $E_1, \ldots, E_k$ such that

$$A = E_1 E_2 \cdots E_k R$$

PROOF By Theorem 1.17 there exist elementary matrices $E'_1, \ldots, E'_k$ such that

$$R = E'_1 E'_2 \cdots E'_k A$$

But, by Lemma 1.16 (ii), each elementary matrix is non-singular and its inverse is also an elementary matrix. Thus

$$A = E'^{-1}_k E'^{-1}_{k-1} \cdots E'^{-1}_1 R$$

as required. ■

COROLLARY 2 The following statements are equivalent.

(i) the n x n matrix A is invertible.

(ii) the homogeneous system AX = 0 of n linear equations in n variables $x_1, \ldots, x_n$ has no non-trivial solution.

(iii) A is a product of elementary matrices.

PROOF (i) ⇒ (ii) If A is invertible, the system of n linear equations in n variables given by AX = 0 has no non-trivial solution, since then

$$X = (A^{-1}A)X = A^{-1}(AX) = A^{-1}(0) = 0$$

(ii) ⇒ (iii) By Corollary 4 to Theorem 1.7 the reduced echelon matrix of A is $I_n$ and by Corollary 1 above

$$A = E_1 \cdots E_k$$

is a product of elementary matrices.

(iii) ⇒ (i) If $A = E_1 E_2 \cdots E_k$, where each $E_i$ is an elementary matrix, then since elementary matrices are invertible, by Lemma 1.12

$$A^{-1} = E_k^{-1} \cdots E_1^{-1}$$

and A is invertible. ■

Hence, in order to determine whether a matrix A is invertible or not, we note that if the reduced echelon matrix of A is R, then

(i) if $R = I_n$, the matrix A is invertible.

(ii) if R contains a row of zeros, then A is not invertible.

This corollary is important in that it gives an explicit method for calculating the inverse of an invertible matrix. If A is non-singular, then the reduced echelon matrix of A is the identity matrix. If $E_1, \ldots, E_k$ are the elementary matrices which correspond to the elementary row operations which must be performed on A to give $I_n$,

i.e. $I_n = E_k \cdots E_2 E_1 A$

then $A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} I_n$

and $A^{-1} = E_k \cdots E_2 E_1 I_n$

That is, $A^{-1}$ is determined by applying the same elementary row operations to $I_n$ as were used to obtain $I_n$ from A.
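The method just described can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the book's own algorithm: it uses exact rational arithmetic from the standard `fractions` module, reduces the augmented matrix $(A \mid I_n)$, and the function name `inverse` is a hypothetical choice. It returns `None` when no pivot can be found, which corresponds to the reduced echelon matrix containing a row of zeros.

```python
from fractions import Fraction


def inverse(a):
    """Invert a square matrix by row-reducing (A | I_n); None if singular."""
    n = len(a)
    # Build the augmented matrix (A | I_n) with exact rational entries.
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None                      # A is not row equivalent to I_n
        m[col], m[pivot] = m[pivot], m[col]  # r_col <-> r_pivot
        p = m[col][col]
        m[col] = [x / p for x in m[col]]     # make the leading entry 1
        for r in range(n):
            if r != col and m[r][col] != 0:  # clear the rest of the column
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]            # the right-hand block is A^{-1}


A = [[1, -1, 2], [3, 2, 4], [0, 1, -2]]
for row in inverse(A):
    print([str(x) for x in row])
```

The row operations chosen by the program need not be the ones a person would choose by hand, but since the inverse is unique (Lemma 1.11) the result is the same.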

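EXAMPLES

1. Find the inverse of

$$A = \begin{pmatrix} 1 & -1 & 2 \\ 3 & 2 & 4 \\ 0 & 1 & -2 \end{pmatrix}$$

$$\left(\begin{array}{ccc|ccc} 1 & -1 & 2 & 1 & 0 & 0 \\ 3 & 2 & 4 & 0 & 1 & 0 \\ 0 & 1 & -2 & 0 & 0 & 1 \end{array}\right)
\xrightarrow{\; r_2 \to r_2 - 3r_1 \;}
\left(\begin{array}{ccc|ccc} 1 & -1 & 2 & 1 & 0 & 0 \\ 0 & 5 & -2 & -3 & 1 & 0 \\ 0 & 1 & -2 & 0 & 0 & 1 \end{array}\right)$$

$$\xrightarrow[\; r_1 \to r_1 + r_2,\; r_3 \to r_3 - 5r_2 \;]{\; r_2 \leftrightarrow r_3 \;}
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & -2 & 0 & 0 & 1 \\ 0 & 0 & 8 & -3 & 1 & -5 \end{array}\right)$$

$$\xrightarrow[\; r_2 \to r_2 + 2r_3 \;]{\; r_3 \to \frac{1}{8} r_3 \;}
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & -\frac{3}{4} & \frac{1}{4} & -\frac{1}{4} \\ 0 & 0 & 1 & -\frac{3}{8} & \frac{1}{8} & -\frac{5}{8} \end{array}\right)$$

therefore

$$A^{-1} = \begin{pmatrix} 1 & 0 & 1 \\ -\frac{3}{4} & \frac{1}{4} & -\frac{1}{4} \\ -\frac{3}{8} & \frac{1}{8} & -\frac{5}{8} \end{pmatrix}$$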





\ 


2 




r 2 + r j 


'3-+'*- 'l 


r 3-+ r 3~ r 2 


0 

1 

0 


and thus A is not invertible.



r 2~+ r 2- r i 


r „ -> 4r 


3 ' 2 


'l - 'l + >*2 

r 3^ r 3~ r 2 




/I 

0 

\o 



r, — r. 




/i o o 
0 1 0 
\o 0 1 


and thus A is invertible.

Exercises 1.6 

1. Find inverses (if they exist) of the following matrices 

(i) /I 0 1\ (ii) /I 2 3\ (hi) /I 

1 3 

1 5 
3

-2 

-1 



mm 


(iv) 

( 3 

-1 

4 \ 

(V) 

1 ° 

-1 

2 

i\ 


5 

1 

- 3 


-4 

3 

-3 

5 



-1 

i/ 


1 

0 

0 

-1 






l- 1 

1 

0 

V 

(vi) 

1 

2 

-3 


(vii) 

/> 


0 


2 

4 

1 

3 


* 

-l 

1 H 


-1 

1 

1 

0 


V- 

i 0 2 


I" 2 

2 

-5 

-■/ 






where $i = \sqrt{-1}$.


1.7 Elementary Column Operations and Equivalent Matrices 

Most of the material in this chapter has been developed in terms of 
elementary row operations. It is clear that the definition of elementary 
row operations (Definition 1.2) and row equivalent matrices (Definition 
1.3) may be adapted in an obvious way to define elementary column 
operations on a matrix and column equivalent matrices. 

In Theorem 1.17, we saw that if R is the reduced echelon matrix of an m x n matrix A, then there exists an invertible matrix P such that

$$PA = R$$

Indeed, it is easy to calculate P, since by Theorem 1.17,

$$R = E_1 E_2 \cdots E_k A$$

where $E_k, E_{k-1}, \ldots, E_1$ are the elementary matrices which correspond to the sequence of elementary row operations which are carried out successively in order to obtain R from A. Thus, we have

$$P = E_1 E_2 \cdots E_k = E_1(E_2(\cdots E_{k-1}(E_k I_m)) \cdots)$$

and the matrix P is obtained by applying to $I_m$ the same elementary row operations as were used to obtain R from A (cf. the method given for determining the inverse of an invertible matrix in § 1.6).
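EXAMPLE

If

$$A = \begin{pmatrix} 2 & -1 & 0 & 1 \\ 1 & 2 & 1 & -1 \\ 2 & 9 & 4 & -5 \end{pmatrix}$$

then

$$\left(\begin{array}{cccc|ccc} 2 & -1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & -1 & 0 & 1 & 0 \\ 2 & 9 & 4 & -5 & 0 & 0 & 1 \end{array}\right)
\xrightarrow[\; r_3 \to r_3 - 2r_2 \;]{\; r_1 \to r_1 - 2r_2 \;}
\left(\begin{array}{cccc|ccc} 0 & -5 & -2 & 3 & 1 & -2 & 0 \\ 1 & 2 & 1 & -1 & 0 & 1 & 0 \\ 0 & 5 & 2 & -3 & 0 & -2 & 1 \end{array}\right)$$

$$\xrightarrow[\; r_1 \to -\frac{1}{5} r_1 \;]{\; r_3 \to r_3 + r_1 \;}
\left(\begin{array}{cccc|ccc} 0 & 1 & \frac{2}{5} & -\frac{3}{5} & -\frac{1}{5} & \frac{2}{5} & 0 \\ 1 & 2 & 1 & -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & -4 & 1 \end{array}\right)$$

$$\xrightarrow[\; r_1 \to r_1 - 2r_2 \;]{\; r_1 \leftrightarrow r_2 \;}
\left(\begin{array}{cccc|ccc} 1 & 0 & \frac{1}{5} & \frac{1}{5} & \frac{2}{5} & \frac{1}{5} & 0 \\ 0 & 1 & \frac{2}{5} & -\frac{3}{5} & -\frac{1}{5} & \frac{2}{5} & 0 \\ 0 & 0 & 0 & 0 & 1 & -4 & 1 \end{array}\right)$$

It is easily verified that

$$PA = \begin{pmatrix} \frac{2}{5} & \frac{1}{5} & 0 \\ -\frac{1}{5} & \frac{2}{5} & 0 \\ 1 & -4 & 1 \end{pmatrix} \begin{pmatrix} 2 & -1 & 0 & 1 \\ 1 & 2 & 1 & -1 \\ 2 & 9 & 4 & -5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & \frac{1}{5} & \frac{1}{5} \\ 0 & 1 & \frac{2}{5} & -\frac{3}{5} \\ 0 & 0 & 0 & 0 \end{pmatrix} = R$$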
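The computation of P alongside R can be sketched as follows in Python. This is a hedged illustration: the function name `rref_with_transform` is hypothetical, exact fractions are assumed, and the program may choose different row operations from those used by hand. When R has a row of zeros the matrix P is not unique, so the P returned need not coincide with the one displayed above, although PA = R still holds.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def rref_with_transform(a):
    """Row-reduce (A | I_m); return (P, R) with P invertible and P A = R."""
    m, n = len(a), len(a[0])
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
           for i, row in enumerate(a)]
    row = 0
    for col in range(n):
        pivot = next((r for r in range(row, m) if aug[r][col] != 0), None)
        if pivot is None:
            continue                          # no leading 1 in this column
        aug[row], aug[pivot] = aug[pivot], aug[row]
        p = aug[row][col]
        aug[row] = [x / p for x in aug[row]]
        for r in range(m):
            if r != row and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[row])]
        row += 1
        if row == m:
            break
    R = [r[:n] for r in aug]                  # reduced echelon matrix of A
    P = [r[n:] for r in aug]                  # the same operations applied to I_m
    return P, R


A = [[2, -1, 0, 1], [1, 2, 1, -1], [2, 9, 4, -5]]
P, R = rref_with_transform(A)
print(matmul(P, A) == R)
```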


Theorem 1.5 and Theorem 1.17 have their equivalent statements in terms of elementary column operations. In particular, we can prove

THEOREM 1.18 If an m x n matrix A is column equivalent to an m x n matrix B, then there exist a finite number of elementary matrices $E_1, E_2, \ldots, E_k$ such that

$$B = A E_1 E_2 \cdots E_k$$

Following the above argument, we see that there exists an invertible n x n matrix Q such that B = AQ.

We apply this, in particular, to the reduced echelon matrix R of an m x n matrix A. Suppose that the leading 1's appear in columns $j_1, j_2, \ldots, j_r$, then by subtracting suitable multiples of these columns from succeeding columns and then the column interchanges $c_1 \leftrightarrow c_{j_1}, c_2 \leftrightarrow c_{j_2}, \ldots, c_r \leftrightarrow c_{j_r}$, we have that R is column equivalent to the matrix

$$N = \begin{pmatrix} I_{r \times r} & 0_{r \times (n-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (n-r)} \end{pmatrix}$$

where $I_{r \times r}$ is the r x r identity matrix and $0_{p \times q}$ is the zero p x q matrix. In other words, we can state this as

THEOREM 1.19 If A is an m x n matrix then there exists an invertible m x m matrix P and an invertible n x n matrix Q such that

$$PAQ = N$$

where N is the matrix given above.

As explained above, the matrices P and Q are easily calculated as illustrated in the following:


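EXAMPLE

If

$$A = \begin{pmatrix} 2 & -1 & 0 & 1 \\ 1 & 2 & 1 & -1 \\ 2 & 9 & 4 & -5 \end{pmatrix},$$

we have seen in the preceding example that if

$$P = \begin{pmatrix} \frac{2}{5} & \frac{1}{5} & 0 \\ -\frac{1}{5} & \frac{2}{5} & 0 \\ 1 & -4 & 1 \end{pmatrix}, \text{ then } PA = \begin{pmatrix} 1 & 0 & \frac{1}{5} & \frac{1}{5} \\ 0 & 1 & \frac{2}{5} & -\frac{3}{5} \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

Now, we have

$$\begin{pmatrix} 1 & 0 & \frac{1}{5} & \frac{1}{5} \\ 0 & 1 & \frac{2}{5} & -\frac{3}{5} \\ 0 & 0 & 0 & 0 \\ \hline 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\xrightarrow[\; c_4 \to c_4 - \frac{1}{5} c_1 \;]{\; c_3 \to c_3 - \frac{1}{5} c_1 \;}
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & \frac{2}{5} & -\frac{3}{5} \\ 0 & 0 & 0 & 0 \\ \hline 1 & 0 & -\frac{1}{5} & -\frac{1}{5} \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\xrightarrow[\; c_4 \to c_4 + \frac{3}{5} c_2 \;]{\; c_3 \to c_3 - \frac{2}{5} c_2 \;}
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ \hline 1 & 0 & -\frac{1}{5} & -\frac{1}{5} \\ 0 & 1 & -\frac{2}{5} & \frac{3}{5} \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

and if

$$Q = \begin{pmatrix} 1 & 0 & -\frac{1}{5} & -\frac{1}{5} \\ 0 & 1 & -\frac{2}{5} & \frac{3}{5} \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \text{ then } PAQ = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$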


This leads us to make the following 

DEFINITION 1.20 Two m x n matrices A and B are equivalent if there exist invertible matrices P and Q such that

$$PAQ = B$$

This is easily shown to be an equivalence relation on the set of m x n matrices over a field K, since any m x n matrix is clearly equivalent to itself; if PAQ = B then $P^{-1}BQ^{-1} = A$; and if PAQ = B and P'BQ' = C then (P'P)A(QQ') = C. What we have proved above is that every m x n matrix A is equivalent to a matrix with the simple form N, where r denotes the number of non-zero rows in the reduced echelon form of A. N may be regarded as a canonical form of A under this equivalence relation, that is, a representative of the equivalence class which contains A under this equivalence relation. Later, in Chapter IV (see Corollary to Theorem 4.20), it will be seen that r is the rank of the matrix A.

N is called the normal form of the matrix A. 


Exercise 1.7 


1. Find invertible matrices P and Q such that PAQ is in normal form 
for the following matrices A 


(0/i o -l 

2 3 1 


(iii) / 


00 




/' 

0 

-M 

(iv) 

3 

“I 

0 

1 

i\ 

2 

1 

2 


-2 

0 

2 

1 

-i 

-1 

3 

1 


1 

2 

0 

0 

i 

\-i 

0 

1 / 


\-l 

1 

-2 

-2 

0 / 
2. Determine which of the following matrices are equivalent


(i) 

1 

-1 

1 

2 \ 

(ii) 

1 ° 

1 

0 

- 2 \ 


0 

1 

2 

0 


0 

“I 

0 

2 


ll 

0 

3 

2 / 


lo 

2 

0 

-4/ 

(iii) 

/ 2 

-1 

0 


(iv) 

(~l 

1 

0 

| 

- 

1 

1 

" 2 



2 

0 

1 


1 1 

1 

3 

-4/ 



\ 1 

1 

2 - 


(v) 
CHAPTER 2


Determinants 


Let M n (K) denote the set of all n x n matrices over a field K. 


2.1 2x2 and 3x3 Determinants 

We shall first motivate our discussion of n x n determinants by 
considering 2x2 and 3x3 determinants. 

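Let $A = \begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix} \in M_2(K)$, then define det : $M_2(K) \to K$ by

$$\det A = \alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21} \in K$$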


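The determinant function arises naturally in the solution of linear equations, for consider the system of two linear equations in two variables

$$\alpha_{11}x_1 + \alpha_{12}x_2 = \beta_1$$

$$\alpha_{21}x_1 + \alpha_{22}x_2 = \beta_2$$

then, provided that det A ≠ 0,

$$x_1 = \frac{\alpha_{22}\beta_1 - \alpha_{12}\beta_2}{\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}} = \frac{\det\begin{pmatrix} \beta_1 & \alpha_{12} \\ \beta_2 & \alpha_{22} \end{pmatrix}}{\det A}$$

and

$$x_2 = \frac{\alpha_{21}\beta_1 - \alpha_{11}\beta_2}{\alpha_{12}\alpha_{21} - \alpha_{11}\alpha_{22}} = \frac{\det\begin{pmatrix} \alpha_{11} & \beta_1 \\ \alpha_{21} & \beta_2 \end{pmatrix}}{\det A}$$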
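As a small illustration of these formulas, the following Python sketch solves a 2 x 2 system using the two quotients of determinants. The function names `det2` and `solve2` and the sample system are hypothetical; integer coefficients and exact fractions are assumed.

```python
from fractions import Fraction


def det2(a11, a12, a21, a22):
    """Determinant of the 2 x 2 matrix with rows (a11, a12) and (a21, a22)."""
    return a11 * a22 - a12 * a21


def solve2(a11, a12, a21, a22, b1, b2):
    """Solve a11 x1 + a12 x2 = b1, a21 x1 + a22 x2 = b2 when det A != 0."""
    d = det2(a11, a12, a21, a22)
    if d == 0:
        raise ValueError("det A = 0: the formulas do not apply")
    x1 = Fraction(det2(b1, a12, b2, a22), d)   # first column replaced by (b1, b2)
    x2 = Fraction(det2(a11, b1, a21, b2), d)   # second column replaced by (b1, b2)
    return x1, x2


# Sample system: 2 x1 + x2 = 5, x1 - 3 x2 = -1.
print(solve2(2, 1, 1, -3, 5, -1))
```

For this sample system det A = -7 and the solution is $x_1 = 2$, $x_2 = 1$, which can be confirmed by substitution.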



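The following properties of the determinant function are easily verified

(i) $$\det\begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} + \beta_{21} & \alpha_{22} + \beta_{22} \end{pmatrix} = \det\begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix} + \det\begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix}$$

i.e. $\alpha_{11}(\alpha_{22} + \beta_{22}) - \alpha_{12}(\alpha_{21} + \beta_{21}) = (\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}) + (\alpha_{11}\beta_{22} - \alpha_{12}\beta_{21})$. In other words, if $r_1 = (\alpha_{11}, \alpha_{12})$, $r_2 = (\alpha_{21}, \alpha_{22})$, $r'_2 = (\beta_{21}, \beta_{22})$, then

$$\det\begin{pmatrix} r_1 \\ r_2 + r'_2 \end{pmatrix} = \det\begin{pmatrix} r_1 \\ r_2 \end{pmatrix} + \det\begin{pmatrix} r_1 \\ r'_2 \end{pmatrix}$$

Similarly, the following are also easily verified

(ii) $\det\begin{pmatrix} r_1 \\ \alpha r_2 \end{pmatrix} = \alpha \det\begin{pmatrix} r_1 \\ r_2 \end{pmatrix}$ for all $\alpha \in K$,

(iii) $\det\begin{pmatrix} r_1 + \alpha r_2 \\ r_2 \end{pmatrix} = \det\begin{pmatrix} r_1 \\ r_2 \end{pmatrix}$ for all $\alpha \in K$,

(iv) $\det\begin{pmatrix} r_1 \\ r_1 \end{pmatrix} = 0$,

(v) $\det\begin{pmatrix} r_2 \\ r_1 \end{pmatrix} = -\det\begin{pmatrix} r_1 \\ r_2 \end{pmatrix}$,

(vi) $\det\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1$,

(vii) $\det A^t = \det A$.

(i) and (ii) imply that the determinant function is linear on the rows of A, that is

$$\det\begin{pmatrix} r_1 + \alpha r'_1 \\ r_2 \end{pmatrix} = \det\begin{pmatrix} r_1 \\ r_2 \end{pmatrix} + \alpha \det\begin{pmatrix} r'_1 \\ r_2 \end{pmatrix}$$

(vii) implies that statements (i)-(v) can also be stated for the columns of the matrix A, for example

$$\det(c_1 + \alpha c'_1, c_2) = \det(c_1, c_2) + \alpha \det(c'_1, c_2)$$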

Not all of the above results are independent as will be seen later when we consider the general case. The above properties are also useful in evaluating determinants, although it should be emphasized that for 2 x 2 matrices the evaluation can be performed easily by applying the definition directly, as for example in

$$\det\begin{pmatrix} x & y \\ x^3 & y^3 \end{pmatrix} = xy^3 - x^3y = xy(y^2 - x^2).$$

This determinant may also be evaluated by applying the above rules,

$$\det\begin{pmatrix} x & y \\ x^3 & y^3 \end{pmatrix} = xy \det\begin{pmatrix} 1 & 1 \\ x^2 & y^2 \end{pmatrix} \quad \text{by (ii)}$$

$$= xy \det\begin{pmatrix} 1 & 0 \\ x^2 & y^2 - x^2 \end{pmatrix} \quad \text{by (iii)}$$

$$= xy(y^2 - x^2)$$
We now consider the case of 3 x 3 determinants.

Let $A = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} \\ \alpha_{21} & \alpha_{22} & \alpha_{23} \\ \alpha_{31} & \alpha_{32} & \alpha_{33} \end{pmatrix} \in M_3(K)$, then define det : $M_3(K) \to K$ by

$$\det A = \alpha_{11}\alpha_{22}\alpha_{33} + \alpha_{12}\alpha_{23}\alpha_{31} + \alpha_{13}\alpha_{21}\alpha_{32} - \alpha_{11}\alpha_{23}\alpha_{32} - \alpha_{12}\alpha_{21}\alpha_{33} - \alpha_{13}\alpha_{22}\alpha_{31}$$

Although there are ways of memorizing this formula, it is not pleasant to use for the actual evaluation of a determinant. But again the properties (i)-(vii), suitably adapted, are valid in this case.

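In addition, if we let $A_{ij}$ be the 2 x 2 matrix obtained by deleting the ith row and jth column of A (i, j = 1, 2, 3) then

(viii) $$\begin{aligned}
\det A &= \alpha_{11}\det A_{11} - \alpha_{12}\det A_{12} + \alpha_{13}\det A_{13} \\
&= -\alpha_{21}\det A_{21} + \alpha_{22}\det A_{22} - \alpha_{23}\det A_{23} \\
&= \alpha_{31}\det A_{31} - \alpha_{32}\det A_{32} + \alpha_{33}\det A_{33} \\
&= \alpha_{11}\det A_{11} - \alpha_{21}\det A_{21} + \alpha_{31}\det A_{31} \\
&= -\alpha_{12}\det A_{12} + \alpha_{22}\det A_{22} - \alpha_{32}\det A_{32} \\
&= \alpha_{13}\det A_{13} - \alpha_{23}\det A_{23} + \alpha_{33}\det A_{33}.
\end{aligned}$$

These equations are verified by expanding the right hand side in each case, for example

$$\begin{aligned}
&\alpha_{11}\det A_{11} - \alpha_{12}\det A_{12} + \alpha_{13}\det A_{13} \\
&= \alpha_{11}\det\begin{pmatrix} \alpha_{22} & \alpha_{23} \\ \alpha_{32} & \alpha_{33} \end{pmatrix} - \alpha_{12}\det\begin{pmatrix} \alpha_{21} & \alpha_{23} \\ \alpha_{31} & \alpha_{33} \end{pmatrix} + \alpha_{13}\det\begin{pmatrix} \alpha_{21} & \alpha_{22} \\ \alpha_{31} & \alpha_{32} \end{pmatrix} \\
&= \alpha_{11}(\alpha_{22}\alpha_{33} - \alpha_{23}\alpha_{32}) - \alpha_{12}(\alpha_{21}\alpha_{33} - \alpha_{23}\alpha_{31}) + \alpha_{13}(\alpha_{21}\alpha_{32} - \alpha_{22}\alpha_{31}) \\
&= \det A
\end{aligned}$$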
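The six expansions in (viii) can be compared with the defining formula on a numerical example. The Python sketch below is illustrative only: indices run from 0 rather than 1 (which does not change the parity of i + j), and the names `det3`, `minor`, `expand_row` and `expand_col`, as well as the sample matrix, are choices made for this example.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(a):
    """The six-term formula defining a 3 x 3 determinant."""
    return (a[0][0] * a[1][1] * a[2][2] + a[0][1] * a[1][2] * a[2][0]
            + a[0][2] * a[1][0] * a[2][1] - a[0][0] * a[1][2] * a[2][1]
            - a[0][1] * a[1][0] * a[2][2] - a[0][2] * a[1][1] * a[2][0])


def minor(a, i, j):
    """The 2 x 2 matrix A_ij obtained by deleting row i and column j (0-based)."""
    return [[a[r][c] for c in range(3) if c != j] for r in range(3) if r != i]


def expand_row(a, i):
    return sum((-1) ** (i + j) * a[i][j] * det2(minor(a, i, j)) for j in range(3))


def expand_col(a, j):
    return sum((-1) ** (i + j) * a[i][j] * det2(minor(a, i, j)) for i in range(3))


A = [[1, -1, 1], [2, 0, -1], [1, 1, 2]]
values = [expand_row(A, i) for i in range(3)] + [expand_col(A, j) for j in range(3)]
print(det3(A), values)
```

In practice one expands along the row or column containing the most zeros, since each zero entry removes a 2 x 2 determinant from the calculation.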

To illustrate the above, we evaluate three examples. 
EXAMPLES
1 . 2 


2 . 




/o -1 -r 


= det 


= —det 


= 5 



by subtracting 2 x 
row 2 from row 1 
and adding 3 x row 
2 to row 3 and 
using (iii) 

by (viii) 


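2. $$\det\begin{pmatrix} 1 & 1 & 1 \\ x & y & z \\ x^2 & y^2 & z^2 \end{pmatrix} = \det\begin{pmatrix} 1 & 0 & 0 \\ x & y - x & z - x \\ x^2 & y^2 - x^2 & z^2 - x^2 \end{pmatrix} \quad \text{by (iii)}$$

$$= \det\begin{pmatrix} y - x & z - x \\ y^2 - x^2 & z^2 - x^2 \end{pmatrix} \quad \text{by (viii)}$$

$$= (y - x)(z - x)\det\begin{pmatrix} 1 & 1 \\ y + x & z + x \end{pmatrix} \quad \text{by (ii)}$$

$$= (y - x)(z - x)\det\begin{pmatrix} 1 & 0 \\ y + x & z - y \end{pmatrix}$$

$$= (y - x)(z - x)(z - y)$$

3. $$\det\begin{pmatrix} y + z & x & x^3 \\ z + x & y & y^3 \\ x + y & z & z^3 \end{pmatrix} = \det\begin{pmatrix} x + y + z & x & x^3 \\ x + y + z & y & y^3 \\ x + y + z & z & z^3 \end{pmatrix}$$

$$= (x + y + z)\det\begin{pmatrix} 1 & x & x^3 \\ 0 & y - x & y^3 - x^3 \\ 0 & z - x & z^3 - x^3 \end{pmatrix}$$

$$= (x + y + z)(y - x)(z - x)\det\begin{pmatrix} 1 & y^2 + yx + x^2 \\ 1 & z^2 + zx + x^2 \end{pmatrix}$$

$$= (x + y + z)(y - x)(z - x)\det\begin{pmatrix} 1 & y^2 + yx + x^2 \\ 0 & z^2 - y^2 + (z - y)x \end{pmatrix}$$

$$= (x + y + z)^2 (y - x)(z - x)(z - y)$$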
Exercises 2.1

In the following exercises we write \A \ for det A 
1. Evaluate the following determinants 


2. Evaluate the following determinants in factors 


(hi) 


b 2 + c 2 
c 2 + a 2 


(b~c) 2 

(c-af 


(iv) 


a 2 + b 2 ( a—b ) 2 
(x — a) 2 (x + a) 2 
( y — a ) 2 (y + a ) 2 
(z-a) 2 


a 2 4- 2 be 
b 2 4- 2ca 
c 2 4- 2 ab 
2y 2 4- 2 z 2 
2 z 2 4- 2x 2 
2x 2 4- 2 y 2 


0) 1 

3 2 



(ii) 

2 0 

-1 


(iii) 


4 5 



1 2 


2 






3 2 


4 


(iv) 

-2 

1 

0 

(v) 

2 

3 

1 



5 0 

1 


1 

0 

1 



0 

2 

2 


3 

3 

2 



2 

-1 

-1 


(0 

a 

b + c 

a 2 

(ii) 

a 2 

(b + c) 2 

be 


b 

c + a 

b 2 


b 2 

(c + a) 2 

ca 


c 

a + b 

c 2 


c 2 

(a + b) 2 

ab 


1 

1 

2 


(z + a) 2 

(v) a(b + c) {b 4- c) 2 a 2 — 2be 

b(c 4- a) (c + a) 2 b 2 — 2ca 

c(a 4- b) ( a 4- b) 2 c 2 — 2ab 

3. Find the general value of Q which satisfies the equation 
1 4- sin 2 6 cos 2 0 4 sin 2 0 

sin 2 6 1 4- cos 2 6 4 sin 2 6 =0 

sin 2 6 cos 2 6 1 + 4 sin 2 6 
4. Solve the equation

$$\begin{vmatrix} a - x & b - x & c \\ a - x & c & b - x \\ a & b - x & c - x \end{vmatrix} = 0$$


2.2 n x n Determinants 


In the previous section, we saw that even for the case n = 3, the explicit expression for the determinant of a matrix is very complicated. It was also shown that the explicit form is not important in the evaluation of determinants; instead, the properties of determinants were exploited. For higher values of n, we seek a function which has these desirable properties. In this section, we show that such a function exists and is unique. Our main result will be to prove


THEOREM 2.1 If $A = (\alpha_{ij}) = (r_1, r_2, \ldots, r_n) \in M_n(K)$, where $r_i = (\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in})$ ($i = 1, \ldots, n$) are the rows of A, then there exists a function D : $M_n(K) \to K$ such that

(i) D is linear on the rows of A, i.e. for i = 1, 2, ..., n

$$D(r_1, \ldots, \alpha r_i + r'_i, \ldots, r_n) = \alpha D(r_1, \ldots, r_i, \ldots, r_n) + D(r_1, \ldots, r'_i, \ldots, r_n),$$

(ii) if two adjacent rows of A are equal, i.e. $r_k = r_{k+1}$ for some k such that k = 1, 2, ..., n - 1, then D(A) = 0,

(iii) $D(I_n) = 1$, where $I_n$ is the identity n x n matrix.

D is uniquely determined by the above properties (i), (ii) and (iii) and is called the determinant function.

The reader should note that the $r_i$ are rows of the matrix A, which has been written in the format $A = (r_1, \ldots, r_n)$ for obvious typographical reasons.


Before proving the theorem, we show that the additional properties 
of determinants which have already been verified for n = 2, 3 are 
consequences of the three defining properties (i), (ii) and (iii). 
Furthermore, we shall also denote the determinant function by det and the value by det A or the more traditional $|A|$ or $|\alpha_{ij}|$ depending on what is most convenient in the context.


PROPOSITION 2.2 For j = 1, 2, ..., n - 1, if $B = (r_1, \ldots, r_{j+1}, r_j, \ldots, r_n)$, then det B = -det A, i.e. if two adjacent rows of A are interchanged, the value of the determinant is changed by a factor of -1.

PROOF

$$\begin{aligned}
0 &= \det(r_1, \ldots, r_j + r_{j+1}, r_j + r_{j+1}, \ldots, r_n) && \text{by (ii)} \\
&= \det(r_1, \ldots, r_j, r_j, \ldots, r_n) + \det(r_1, \ldots, r_j, r_{j+1}, \ldots, r_n) \\
&\quad + \det(r_1, \ldots, r_{j+1}, r_j, \ldots, r_n) + \det(r_1, \ldots, r_{j+1}, r_{j+1}, \ldots, r_n) && \text{by repeated application of (i)} \\
&= 0 + \det A + \det B + 0 && \text{by (ii),}
\end{aligned}$$

that is

$$\det B = -\det A \quad ■$$

PROPOSITION 2.3 If two rows in A are equal, then det A = 0.

PROOF Suppose that the ith and jth rows of A are equal. Interchange the jth row with the succeeding rows adjacent to it until it is adjacent to the ith row. By Proposition 2.2, the sign of the determinant changes at every interchange, but now we have a determinant with adjacent rows equal and by Theorem 2.1 (ii), det A = 0. ■

PROPOSITION 2.4 If j ≠ i, then

$$\det(r_1, \ldots, r_i + \alpha r_j, \ldots, r_j, \ldots, r_n) = \det(r_1, \ldots, r_i, \ldots, r_j, \ldots, r_n),$$

i.e. if a scalar multiple of one row is added to another row then the value of the determinant is unaltered.

PROOF

$$\begin{aligned}
\det(r_1, \ldots, r_i + \alpha r_j, \ldots, r_j, \ldots, r_n) &= \det(r_1, \ldots, r_i, \ldots, r_j, \ldots, r_n) \\
&\quad + \alpha \det(r_1, \ldots, r_j, \ldots, r_j, \ldots, r_n) && \text{by Theorem 2.1 (i)} \\
&= \det(r_1, \ldots, r_i, \ldots, r_j, \ldots, r_n) && \text{by Proposition 2.3} \quad ■
\end{aligned}$$

In order to prove the theorem we need the following 

DEFINITION 2.5 If $A = (\alpha_{ij}) \in M_n(K)$, let $A_{ij} \in M_{n-1}(K)$ be the (n - 1) x (n - 1) matrix obtained by deleting the ith row and jth column of A. $A_{ij}$ is called the minor of the element $\alpha_{ij}$. The cofactor of $\alpha_{ij}$ in A is

$$c_{ij} = (-1)^{i+j} \det A_{ij} \in K$$

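EXAMPLE

If $A = \begin{pmatrix} 1 & -1 & 1 \\ 2 & 0 & -1 \\ 1 & 1 & 2 \end{pmatrix}$, then

$$c_{12} = (-1)^3 \det\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix} = -5$$

$$c_{23} = (-1)^5 \det\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} = -2$$

We note that the sign $(-1)^{i+j}$ is given by the pattern

$$\begin{pmatrix} + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ + & - & + & - & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}$$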


We are now in a position to give the 


PROOF OF THEOREM 2.1 The proof is divided into two parts. We first prove the existence of the function D and then show that it is unique.

EXISTENCE For a fixed j = 1, 2, ..., n, define D : $M_n(K) \to K$ by

$$D(A) = \sum_{i=1}^{n} \alpha_{ij} c_{ij}$$

The proof is by induction on n, that is, we assume that we have been able to define determinants for all integers < n.

If n = 1, then the determinant function exists, i.e. $D(\alpha) = \alpha$, $\alpha \in K$ (we have also verified this for n = 2, 3).

We now show that D satisfies the three properties (i), (ii) and (iii) in the statement of the theorem.

(i) Consider D as a function of the kth row of A, and consider for a fixed 1 ≤ i ≤ n, the term

$$\alpha_{ij} c_{ij} = (-1)^{i+j} \alpha_{ij} \det(A_{ij})$$

If i ≠ k, $\alpha_{ij}$ does not depend on the kth row and by induction, since $A_{ij}$ contains elements from the kth row of A, $\det A_{ij}$ depends linearly on the kth row of A. If i = k, $\alpha_{kj}$ depends linearly on the kth row of A and $\det A_{kj}$ does not depend on the kth row of A.

Thus, each term in D(A) depends linearly on the kth row of A and so also does D(A).

(ii) Suppose two adjacent rows of A are equal, say $r_k = r_{k+1}$ for some k = 1, 2, ..., n - 1. Let i ≠ k, k + 1, then $A_{ij}$ has two adjacent rows equal and by the induction hypothesis $c_{ij} = (-1)^{i+j}\det(A_{ij}) = 0$. Thus, we have

$$\det A = \alpha_{kj}(-1)^{j+k}\det(A_{kj}) + \alpha_{k+1,j}(-1)^{j+k+1}\det(A_{k+1,j})$$

But $A_{kj} = A_{k+1,j}$ and so $\det(A_{kj}) = \det(A_{k+1,j})$ and $\alpha_{kj} = \alpha_{k+1,j}$, thus we have

$$\det A = \alpha_{kj}(-1)^{j+k}\det(A_{kj})(1 - 1) = 0.$$

(iii) If $A = I_n$, the only term which gives a non-zero contribution in D(A) is

$$(-1)^{j+j}\alpha_{jj}\det(A_{jj}) = 1 \cdot \det I_{n-1} = 1$$

by the induction hypothesis, i.e.

$$D(A) = 1$$
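The inductive definition used in the existence proof translates directly into a recursive procedure. The following Python sketch is an illustration under stated assumptions: indices are 0-based, the names `minor` and `det_by_column` and the sample matrix are hypothetical, and the (n - 1) x (n - 1) determinants are themselves expanded along their first column.

```python
def minor(a, i, j):
    """A_ij: delete row i and column j of A (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det_by_column(a, j=0):
    """D(A) = sum over i of alpha_ij * c_ij, for a fixed column j."""
    n = len(a)
    if n == 1:
        return a[0][0]          # the case n = 1: D(alpha) = alpha
    return sum((-1) ** (i + j) * a[i][j] * det_by_column(minor(a, i, j))
               for i in range(n))


A = [[-1, 0, 0, 3], [1, 3, 4, 1], [2, 2, 1, 5], [0, 1, 2, 7]]
print([det_by_column(A, j) for j in range(4)])   # the same value for every j

I4 = [[int(r == c) for c in range(4)] for r in range(4)]
print(det_by_column(I4))                          # property (iii)

B = [A[0], A[1], A[1], A[3]]                      # two adjacent rows equal
print(det_by_column(B))                           # property (ii)
```

The recursion evaluates on the order of n! products, so it serves only as a check of the definition; for larger matrices the row operations of Proposition 2.4 are far more economical.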

UNIQUENESS Let D, D' : $M_n(K) \to K$ satisfy the three conditions (i), (ii) and (iii), then we show that

$$D(r_1, r_2, \ldots, r_n) = D'(r_1, r_2, \ldots, r_n)$$

for all $A = (r_1, r_2, \ldots, r_n) \in M_n(K)$. If $A = (r_1, r_2, \ldots, r_n)$ and

$$\Delta(r_1, r_2, \ldots, r_n) = D(r_1, r_2, \ldots, r_n) - D'(r_1, r_2, \ldots, r_n),$$

then we must show that

$$\Delta(r_1, r_2, \ldots, r_n) = 0$$

It is easily verified that

(a) $\Delta$ is linear in the rows of A,

(b) if two rows of A are equal then $\Delta(r_1, r_2, \ldots, r_n) = 0$,

(c) if $(j_1, j_2, \ldots, j_n)$ is a rearrangement of (1, 2, ..., n) then

$$\Delta(r_{j_1}, r_{j_2}, \ldots, r_{j_n}) = \pm\Delta(r_1, r_2, \ldots, r_n)$$

(d) $\Delta(I_n) = 0$.

Now, let

$$e_1 = (1, 0, \ldots, 0),\; e_2 = (0, 1, 0, \ldots, 0),\; \ldots,\; e_n = (0, 0, \ldots, 0, 1),$$

then for i = 1, 2, ..., n

$$r_i = (\alpha_{i1}, \ldots, \alpha_{in}) = \sum_{j=1}^{n} \alpha_{ij} e_j$$

Thus, we have by repeatedly applying (a) above that

$$\Delta(r_1, r_2, \ldots, r_n) = \Delta\Big( \sum_{k=1}^{n} \alpha_{1k} e_k, \ldots, \sum_{k=1}^{n} \alpha_{nk} e_k \Big) = \sum_{k_1, \ldots, k_n = 1}^{n} \alpha_{1k_1}\alpha_{2k_2} \cdots \alpha_{nk_n} \Delta(e_{k_1}, e_{k_2}, \ldots, e_{k_n})$$

where the sum ranges over the $n^n$ terms obtained by allowing $k_1, k_2, \ldots, k_n$ to take values between 1 and n.

But, by (b), if any of the $e_{k_i}$'s are equal, $\Delta(e_{k_1}, e_{k_2}, \ldots, e_{k_n}) = 0$. Thus it follows that

$$\Delta(r_1, r_2, \ldots, r_n) = \sum_{(k_1, k_2, \ldots, k_n)} \alpha_{1k_1}\alpha_{2k_2} \cdots \alpha_{nk_n} \Delta(e_{k_1}, e_{k_2}, \ldots, e_{k_n})$$

where the sum ranges over all rearrangements $(k_1, k_2, \ldots, k_n)$ of (1, 2, ..., n). By (c) and (d), it now follows that if $(k_1, k_2, \ldots, k_n)$ is a rearrangement of (1, 2, ..., n) then

$$\Delta(e_{k_1}, e_{k_2}, \ldots, e_{k_n}) = \pm\Delta(e_1, e_2, \ldots, e_n) = 0$$

and thus

$$\Delta(r_1, r_2, \ldots, r_n) = 0 \text{ as required} \quad ■$$

Remark 1 By a similar argument to the above, it can be shown that

$$D(r_1, r_2, \ldots, r_n) = \sum \alpha_{1k_1}\alpha_{2k_2} \cdots \alpha_{nk_n} D(e_{k_1}, e_{k_2}, \ldots, e_{k_n}) = \sum \operatorname{sgn}(k_1, k_2, \ldots, k_n)\, \alpha_{1k_1}\alpha_{2k_2} \cdots \alpha_{nk_n}$$

where the sum ranges over the n! rearrangements $(k_1, k_2, \ldots, k_n)$ of (1, 2, ..., n) and $\operatorname{sgn}(k_1, k_2, \ldots, k_n) = (-1)^p$, where p is the number of row interchanges required to obtain the identity matrix $I_n$ from the matrix $(e_{k_1}, e_{k_2}, \ldots, e_{k_n})$. This formula is often given as the definition of the determinant of a matrix; the reader should verify this formula in the cases n = 2, 3.
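The formula of Remark 1 can be evaluated directly for small n. In the Python sketch below, which is an illustration rather than a recommended method, the sign of a rearrangement is computed from the parity of its number of inversions; this is an assumption of the sketch, relying on the standard fact that this parity agrees with the parity of the number p of row interchanges described above. The names `sgn` and `det_leibniz` are hypothetical.

```python
from itertools import permutations


def sgn(perm):
    """Sign of a rearrangement, computed from the parity of its inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det_leibniz(a):
    """Sum over all n! rearrangements of sgn * alpha_{1 k_1} ... alpha_{n k_n}."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sgn(perm)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total


print(det_leibniz([[1, 2], [3, 4]]))                       # 1*4 - 2*3
print(det_leibniz([[1, -1, 1], [2, 0, -1], [1, 1, 2]]))
```

For n = 2 the two rearrangements give $\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}$, and for n = 3 the six rearrangements reproduce the six terms of the 3 x 3 formula of § 2.1.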

Remark 2 From the definition of D, it follows that if a matrix A has a column of zeros then D(A) = 0. Furthermore, if A has a row of zeros, it can be proved that D(A) = 0 (see Exercise 2.2, No. 8).

Remark 3 The uniqueness of the determinant function implies that the same value for D(A) is obtained whatever value for j such that 1 ≤ j ≤ n has been used in the definition of D(A). We call this the expansion of the determinant of A in terms of the jth column of A.

Similarly, for each 1 ≤ i ≤ n

$$D(A) = \sum_{j=1}^{n} \alpha_{ij} c_{ij}$$

gives the expansion of the determinant in terms of the ith row of A (see p. 56 for a proof of this). Thus, a determinant is expanded in terms of the row or column which is most appropriate, as illustrated in the examples below.


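EXAMPLES

1. $$\begin{vmatrix} -1 & 0 & 0 & 3 \\ 1 & 3 & 4 & 1 \\ 2 & 2 & 1 & 5 \\ 0 & 1 & 2 & 7 \end{vmatrix} = \begin{vmatrix} -1 & 0 & 0 & 3 \\ 0 & 3 & 4 & 4 \\ 0 & 2 & 1 & 11 \\ 0 & 1 & 2 & 7 \end{vmatrix} \quad \text{using Proposition 2.4}$$

$$= -\begin{vmatrix} 3 & 4 & 4 \\ 2 & 1 & 11 \\ 1 & 2 & 7 \end{vmatrix} \quad \text{using the definition of determinants}$$

$$= -\begin{vmatrix} 0 & -2 & -17 \\ 0 & -3 & -3 \\ 1 & 2 & 7 \end{vmatrix} = -\begin{vmatrix} -2 & -17 \\ -3 & -3 \end{vmatrix} = 45$$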


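2. $$\begin{vmatrix} 1 & x & y & 1 \\ 1 & x & x & x \\ x & 1 & xy & y \\ x & x & xy & 1 \end{vmatrix} = \begin{vmatrix} 1 & x & y & 1 \\ 0 & 0 & x - y & x - 1 \\ 0 & 1 - x^2 & 0 & y - x \\ 0 & x - x^2 & 0 & 1 - x \end{vmatrix} = \begin{vmatrix} 0 & x - y & x - 1 \\ 1 - x^2 & 0 & y - x \\ x - x^2 & 0 & 1 - x \end{vmatrix}$$

$$= -(x - y)\begin{vmatrix} 1 - x^2 & y - x \\ x - x^2 & 1 - x \end{vmatrix} = -(x - y)(1 - x)\begin{vmatrix} 1 + x & y - x \\ x & 1 - x \end{vmatrix}$$

$$= (x - y)(x - 1)\begin{vmatrix} 1 & y \\ x & 1 \end{vmatrix} = (x - 1)(x - y)(1 - xy)$$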


3. Let

$$D_n = \begin{vmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ x_1^2 & x_2^2 & \cdots & x_n^2 \\ \vdots & \vdots & & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{vmatrix}$$

then we prove that

$$D_n = \prod_{n \geq i > j \geq 1} (x_i - x_j)$$

The proof is by induction on n, the result being clearly true when n = 2, i.e.

$$D_2 = \begin{vmatrix} 1 & 1 \\ x_1 & x_2 \end{vmatrix} = x_2 - x_1$$

If $r_i$ (i = 1, ..., n) denotes the ith row of $D_n$, then carrying out successively the following elementary row operations $r_n \to r_n - x_1 r_{n-1}$, $r_{n-1} \to r_{n-1} - x_1 r_{n-2}$, ..., $r_2 \to r_2 - x_1 r_1$ we obtain

$$D_n = \begin{vmatrix} 1 & 1 & \cdots & 1 \\ 0 & x_2 - x_1 & \cdots & x_n - x_1 \\ 0 & x_2(x_2 - x_1) & \cdots & x_n(x_n - x_1) \\ \vdots & \vdots & & \vdots \\ 0 & x_2^{n-2}(x_2 - x_1) & \cdots & x_n^{n-2}(x_n - x_1) \end{vmatrix} = (x_2 - x_1)(x_3 - x_1) \cdots (x_n - x_1) D_{n-1}$$

where $D_{n-1}$ denotes the corresponding determinant in $x_2, \ldots, x_n$. The proof is now completed by induction on n. $D_n$ is called the Vandermonde Determinant.


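4. Let

$$D_n = \begin{vmatrix} 2 & 1 & 0 & 0 & \cdots & 0 \\ 1 & 2 & 1 & 0 & \cdots & 0 \\ 0 & 1 & 2 & 1 & \cdots & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 2 \end{vmatrix} = 2D_{n-1} - D_{n-2}$$

Thus we have

$$D_n - D_{n-1} = D_{n-1} - D_{n-2} = D_{n-2} - D_{n-3} = \cdots$$

from which we infer that

$$D_n - D_{n-1} = D_2 - D_1 = 3 - 2 = 1$$

Similarly, we have

$$D_{n-1} - D_{n-2} = 1, \quad \ldots, \quad D_2 - D_1 = 1$$

and adding gives

$$D_n = (n - 1) + D_1 = n + 1$$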
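The closed forms obtained in Examples 3 and 4 can be spot-checked for small n. The Python sketch below assumes integer entries; the helper names `det`, `vandermonde` and `tridiagonal` and the sample values of $x_1, \ldots, x_4$ are illustrative choices, and the determinant is computed by expansion along the first column as in the existence proof.

```python
from math import prod


def det(a):
    """Determinant by expansion along the first column."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for i in range(n):
        if a[i][0] != 0:                      # zero entries contribute nothing
            sub = [row[1:] for k, row in enumerate(a) if k != i]
            total += (-1) ** i * a[i][0] * det(sub)
    return total


def vandermonde(xs):
    """Rows are (1, ..., 1), (x_1, ..., x_n), ..., (x_1^{n-1}, ..., x_n^{n-1})."""
    return [[x ** k for x in xs] for k in range(len(xs))]


def tridiagonal(n):
    """The n x n matrix of Example 4: 2 on the diagonal, 1 beside it."""
    return [[2 if i == j else 1 if abs(i - j) == 1 else 0 for j in range(n)]
            for i in range(n)]


xs = [2, 3, 5, 7]
expected = prod(xs[i] - xs[j] for i in range(len(xs)) for j in range(i))
print(det(vandermonde(xs)) == expected)            # product over i > j
print([det(tridiagonal(n)) for n in range(1, 7)])  # compare with n + 1
```

A check of this kind does not replace the induction arguments, but it is a quick way of catching a sign error in a conjectured formula.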

Exercises 2.2 

1. Evaluate the determinants 
(i)



0 

-1 

2 

1 

(ii) 

2 

1 

0 

1 

-4 

3 

-3 

5 


-1 

2 

1 

0 

1 

0 

0 

-1 


1 

2 

5 

3 

-1 

1 

0 

1 


0 

-1 

1 

1 


2-111 
5 6 12 

1-211 
3-211 


2. For what values of x is the determinant

$$\begin{vmatrix} x & a & b & c \\ a & x & b & c \\ a & b & x & c \\ a & b & c & x \end{vmatrix} = 0 \,?$$


3. Show that the value of the determinant

$$\begin{vmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & a + b & a + c \\ 1 & b + a & 0 & b + c \\ 1 & c + a & c + b & 0 \end{vmatrix}$$

is $-4(ab + bc + ca)$ and hence find the values of the determinant when

(i) a, b and c are the distinct cube roots of unity.

(ii) a, b and c are distinct fourth roots of unity.


4. Show that the following /r-rowed determinants 


0 ) 


1 1 1 ... 1 1 

-1 1 1 ... 1 1 

0-1 1 ... 1 1 
i ! ! ii 

0 0 0 ... -1 1 


= 2 


n -1 
(iii)


(iv) 


X 1 

1 . . 

1 1 




1 X 

0 . . 

o 

o 




1 1 

x . . 

0 0 

= (* 

— 1 ) {(x — \) n 2 + x n 

1 1 

1 . . 

1 X 




1 +X 2 

X 

0 

. . 

0 


X 

1 +x 2 

X 

. . . 

0 


0 

X 

1 + x 2 . . . 

0 

= 1 +x 2 + ... + x 

0 

0 

0 

X 

1 +x 2 


a + b 

a 

a . . . 

a 




In 


a 


a 


a + b a 


a 


a 


a 


a + b 


= b n ~ l (na + b) 


5. If $A = (a_{ij})$ is an $n \times n$ matrix with $a_{11} \neq 0$ and also $a_{ij} \neq 0$ when
$i + j > n + 1$, the remaining elements all being zero, find an expression
for $\det A$.


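6. Let $A_n$ be the $n \times n$ matrix

$$
A_n = \begin{pmatrix}
-2 & 4 & 0 & 0 & \cdots & 0 & 0 \\
1 & -2 & 4 & 0 & \cdots & 0 & 0 \\
0 & 1 & -2 & 4 & \cdots & 0 & 0 \\
\vdots & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & -2 & 4 \\
0 & 0 & 0 & 0 & \cdots & 1 & -2
\end{pmatrix}
$$

Show that $\det A_n = 0$ when $n = 3k - 1$ and find $\det A_n$ when $n = 3k$.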




7. Let $D : M_n(\mathbf{R}) \to \mathbf{R}$ be such that

$$D(A)D(B) = D(AB)$$

for all $A, B \in M_n(\mathbf{R})$. Show that either $D(A) = 0$ for all $A \in M_n(\mathbf{R})$ or
$D(I_n) = 1$. In the latter case, show that $D(A) \neq 0$ whenever $A$ is
invertible.

If $n = 2$, suppose further that
$D\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \neq D\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.
Prove the following

(i) $D(0) = 0$,

(ii) $D(A) = 0$ if $A^2 = 0$,

(iii) $D(B) = -D(A)$ if $B$ is obtained by interchanging the rows
(or columns) of $A$.

8. If $A$ is an $n \times n$ matrix which has a row of zeros prove that
$D(A) = 0$.

9. Let $A_n$ be the $n \times n$ matrix whose $(i, j)$th element is $|i - j|$. Show
that $\det A_n = (-1)^{n-1}(n - 1)2^{n-2}$.

2.3 Further Properties of Determinants 

In this section, we prove that the determinant function is multiplicative 
and deduce some important consequences of this result. The proof 
involves the notion of elementary matrices introduced in Chapter 1. 

We first prove the following lemma 

LEMMA 2.6 If $E$ is an elementary matrix, then

(i) $\det E \neq 0$

(ii) $\det E^t = \det E$

(iii) $E^{-1}$ is an elementary matrix.

PROOF The proof is by a case-by-case evaluation of the determinant
of the three different types of elementary matrices

$$\det M_i(\alpha) = \det M_i(\alpha)^t = \alpha \quad (i = 1, \ldots, n)$$

$$\det H_{ij} = \det H_{ij}^t = -1 \quad (i, j = 1, \ldots, n)$$

$$\det A_{ij}(\alpha) = \det A_{ij}(\alpha)^t = 1 \quad (i, j = 1, \ldots, n)$$

proving (i) and (ii). To prove (iii), we note that

$$M_i(\alpha)^{-1} = M_i(\alpha^{-1}), \quad H_{ij}^{-1} = H_{ij}, \quad A_{ij}(\alpha)^{-1} = A_{ij}(-\alpha) \qquad ∎$$

We now prove our main theorem.

THEOREM 2.7 If $E$ is an elementary $n \times n$ matrix, then

$$\det EA = (\det E)(\det A)$$

for all $n \times n$ matrices $A$.

PROOF The proof is again by a case-by-case evaluation.
Premultiplying a matrix $A$ by $M_i(\alpha)$ is equivalent to multiplying the $i$th
row of $A$ by $\alpha$, thus

$$\det(M_i(\alpha)A) = \alpha \det A \quad \text{by Theorem 2.1 (i)}$$

$$= (\det M_i(\alpha))(\det A) \quad \text{by Lemma 2.6}$$

Premultiplying a matrix $A$ by $H_{ij}$ is equivalent to interchanging the $i$th
and $j$th rows of $A$, which by repeated application of Proposition 2.2
gives

$$\det(H_{ij}A) = -\det A = (\det H_{ij})(\det A)$$

Similarly, premultiplying a matrix $A$ by $A_{ij}(\alpha)$ is equivalent to adding
$\alpha$ times the $j$th row of $A$ to the $i$th row, which by Proposition 2.4 and
Lemma 2.6 gives

$$\det(A_{ij}(\alpha)A) = \det A = (\det A_{ij}(\alpha))(\det A) \qquad ∎$$
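The three cases of the proof can be tried on a small example. The following Python sketch is only an illustration: the constructor names (`scale_row`, `swap_rows`, `add_row_multiple`), the 0-based row numbering and the test matrix are assumptions made here, and the determinant is computed by cofactor expansion along the first row.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def scale_row(n, i, alpha):
    """M_i(alpha): premultiplication multiplies row i by alpha."""
    e = identity(n)
    e[i][i] = alpha
    return e


def swap_rows(n, i, j):
    """H_ij: premultiplication interchanges rows i and j."""
    e = identity(n)
    e[i], e[j] = e[j], e[i]
    return e


def add_row_multiple(n, i, j, alpha):
    """A_ij(alpha): premultiplication adds alpha times row j to row i."""
    e = identity(n)
    e[i][j] = alpha
    return e


A = [[1, 0, -1], [2, 1, -1], [1, 2, 5]]  # arbitrary test matrix
elementary = [scale_row(3, 1, 3), swap_rows(3, 0, 2),
              add_row_multiple(3, 0, 2, 7)]
for E in elementary:
    print(det(E), det(matmul(E, A)), det(E) * det(A))
```

In each printed line the last two numbers coincide, as Theorem 2.7 asserts, and the first number is $\alpha$, $-1$ or $1$ according to the type of elementary matrix.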

Our main results are now obtained as corollaries to this theorem. 

COROLLARY 1 A matrix $A$ is non-singular if and only if $\det A \neq 0$.

PROOF If $A$ is non-singular, then by Theorem 1.17 Corollary 2,
there exist elementary matrices $E_1, \ldots, E_r$ such that

$$E_1 E_2 \cdots E_r A = I_n$$

Thus, by the above theorem

$$(\det E_1)(\det E_2) \cdots (\det E_r)(\det A) = \det I_n = 1$$

and so $\det A \neq 0$ as required.

Conversely, if $A$ is singular, then by Theorem 1.17 there exist
elementary matrices $E_1, \ldots, E_r$ such that

$$E_1 E_2 \cdots E_r A = R$$

where $R$ is a reduced echelon $n \times n$ matrix which contains a row, and
hence a column, of zeros. Thus, $\det R = 0$ and since by Lemma 2.6 (i),
$\det E_i \neq 0$ $(i = 1, \ldots, r)$, it follows that $\det A = 0$. ∎
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COROLLARY 2 If A and B are nx n matrices, then 
det AB = (det A) (det B) 

PROOF If $A$ and $B$ are non-singular matrices, then by Theorem 1.17,
there exist elementary matrices $E_1, \ldots, E_r, E'_1, \ldots, E'_s$ such that

$$A = E_1 E_2 \cdots E_r, \quad B = E'_1 E'_2 \cdots E'_s$$

Then, by repeated application of Theorem 2.7, we have

$$\det AB = \det(E_1 \cdots E_r E'_1 \cdots E'_s)$$

$$= (\det E_1) \cdots (\det E_r)(\det E'_1) \cdots (\det E'_s)$$

$$= (\det A)(\det B)$$

If $A$ is singular, then by Corollary 1, $\det A = 0$ and by Theorem 1.17
there exist elementary matrices $E_1, \ldots, E_r$ such that

$$A = E_1 \cdots E_r R$$

where $R$ contains a row of zeros. Then we have

$$\det(AB) = (\det E_1) \cdots (\det E_r)(\det RB) = 0$$

since $RB$ also contains a row of zeros.

Thus again

$$\det AB = (\det A)(\det B) = 0$$

A similar argument is used if $B$ is singular.
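Corollaries 1 and 2 can be illustrated numerically. The sketch below (Python; the three matrices are arbitrary choices made here, not taken from the text) computes determinants by cofactor expansion and compares $\det AB$ with $(\det A)(\det B)$, once for two non-singular matrices and once with a singular factor.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 0, -1], [2, 1, -1], [1, 2, 5]]
B = [[2, 1, 3], [1, -1, 2], [4, 5, -2]]
S = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]  # second row is twice the first

print(det(matmul(A, B)), det(A) * det(B))  # non-singular factors
print(det(S), det(matmul(S, B)))           # singular factor
```

The second line prints two zeros: a singular factor forces a singular product.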

COROLLARY 3 The system of $n$ homogeneous equations in $n$
variables $\sum_{j=1}^{n} a_{ij} x_j = 0$ $(i = 1, \ldots, n)$ has a non-trivial solution if and
only if $\det A = 0$, where $A = (a_{ij})$.

PROOF This follows immediately from Corollary 1 and Corollary 2
to Theorem 1.17. ∎

COROLLARY 4 If $A \in M_n(K)$, then

$$\det A^t = \det A$$

PROOF If $A$ is non-singular, then by Corollary 2 to Theorem 1.17

$$A = E_1 \cdots E_r$$

where $E_i$ $(i = 1, \ldots, r)$ are elementary matrices. Then using Lemma 2.6
and Theorem 2.7 we have

$$\det A^t = (\det E_r^t) \cdots (\det E_1^t)$$

$$= (\det E_1) \cdots (\det E_r)$$

$$= \det A$$

If $A$ is singular, then by Corollary 1, $\det A = 0$ and

$$A = E_1 \cdots E_r R$$

where $E_1, \ldots, E_r$ are elementary matrices and $R$ is the reduced echelon
matrix of $A$. Since $R^t$ has a row and column of zeros, it follows that
$\det R^t = 0$ and thus, we have

$$\det A^t = 0$$

that is,

$$\det A^t = \det A \ (= 0)$$

in this case also.

This last corollary is important in that it shows that the defining 
properties of a determinant stated in terms of the rows of a matrix A 
are also true for the columns of A, i.e. 

(i) det is linear in the columns of A 

(ii) if two adjacent columns of A are equal then det A = 0. 
Furthermore, Propositions 2.2, 2.3 and 2.4 can now also be stated for
the columns of $A$ and also

$$\sum_{j=1}^{n} a_{ij} c_{ij} = \det A \quad (i = 1, \ldots, n)$$

Of course these statements could have been proved directly for columns,
only minor modifications of the appropriate proofs being essential.


Exercises 2.3 

1. Determine which of the following matrices are non-singular 


(0 


/ 2 


1 


1 -1 


3 \ 

2 


\4 


5 -2/ 



1 1 
0 


(hi) 


0 

1 

-1 
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3 


2 \ 

1 

1 


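2. If $s_r = \alpha^r + \beta^r + \gamma^r$, by expressing the determinant as the product
of two determinants show that

$$
\begin{vmatrix}
3 & s_1 & s_2 \\
s_1 & s_2 & s_3 \\
s_2 & s_3 & s_4
\end{vmatrix}
= (\beta - \gamma)^2 (\gamma - \alpha)^2 (\alpha - \beta)^2
$$

Evaluate the determinant

$$
\begin{vmatrix}
3 & s_1 & s_3 \\
s_1 & s_2 & s_4 \\
s_2 & s_3 & s_5
\end{vmatrix}
$$

3. If $s_r = \alpha^r + \beta^r + \gamma^r$, show that

$$
\begin{vmatrix}
s_r & s_{r+1} & s_{r+2} \\
s_{r+1} & s_{r+2} & s_{r+3} \\
s_{r+2} & s_{r+3} & s_{r+4}
\end{vmatrix}
= \alpha^r \beta^r \gamma^r (\beta - \gamma)^2 (\gamma - \alpha)^2 (\alpha - \beta)^2
$$

If $\alpha$, $\beta$, $\gamma$ are roots of $x^3 + bx + c$, evaluate this expression in terms of
$b$ and $c$.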

4. Prove that the product of the matrices

$$
\begin{pmatrix} a_1 + ib_1 & c_1 + id_1 \\ -(c_1 - id_1) & a_1 - ib_1 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} a_2 + ib_2 & c_2 + id_2 \\ -(c_2 - id_2) & a_2 - ib_2 \end{pmatrix}
$$

is another matrix of the same form, where $a_1, b_1, c_1, d_1, a_2, b_2, c_2, d_2$
are integers. By evaluating the determinants of these matrices prove
that the product of a sum of squares of four integers with a sum of
squares of four integers is the sum of squares of four integers.

5. Let $A$ be the matrix

$$
\begin{pmatrix}
a + bx + cx^2 & a + by + cy^2 & a + bz + cz^2 \\
c + ax + bx^2 & c + ay + by^2 & c + az + bz^2 \\
b + cx + ax^2 & b + cy + ay^2 & b + cz + az^2
\end{pmatrix}
$$

and let $B$ be the matrix

$$
\begin{pmatrix}
1 & 1 & 1 \\
x & y & z \\
x^2 & y^2 & z^2
\end{pmatrix}
$$

Find a matrix $C$ such that $A = CB$.

Show that the determinant of $A$ is equal to

$$
\begin{vmatrix}
a + b + c & a + b + c & a + b + c \\
ax + cy + bz & bx + ay + cz & cx + by + az \\
ax^2 + cy^2 + bz^2 & bx^2 + ay^2 + cz^2 & cx^2 + by^2 + az^2
\end{vmatrix}
$$

6. Determine whether the following systems of homogeneous equations
have non-trivial solutions

$$
\text{(i)} \quad
\begin{aligned}
x + 2y - z &= 0 \\
-2x + y + 2z &= 0 \\
x - y + z &= 0
\end{aligned}
\qquad
\text{(ii)} \quad
\begin{aligned}
x + 2y - z &= 0 \\
-2x + y + 2z &= 0 \\
x - y - z &= 0
\end{aligned}
$$

7. Show that if $k \neq 1$ there is always a solution to the equations

$$
\begin{aligned}
2x - y - z &= 6 \\
x - 2y + z &= p \\
x + ky - 2z &= p^2
\end{aligned}
$$

whatever the values of $p$ and for a fixed $p$ the solution is unique. Find
the solution when $k = 2$, $p = 1$. Show that if $k = 1$ there are exactly
two values of $p$ for which the equations have solutions; find these
values and solve the equations completely in each case.

8. For which values of $k$ does the following system of linear equations
have (i) no solution, (ii) a unique solution and (iii) more than one
solution?

$$
\begin{aligned}
x + (k + 2)y + 2z &= 2 \\
(k^2 + 1)x + 2(k + 2)y + 4z &= 4k \\
3x + 9y + 3(k + 1)z &= 6k
\end{aligned}
$$

9. Solve Exercises 1.3, Nos. 3-7, with the methods now available. 

2.4 The Inverse of a Matrix 

In Chapter 1, a method was presented for inverting a matrix. We now 
give an alternative method which is not as efficient for practical 
purposes but is more useful theoretically. 

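By the definition of determinants, we have for $j = 1, 2, \ldots, n$

$$\sum_{i=1}^{n} a_{ij} c_{ij} = \det A$$

Furthermore, for $k \neq j$, we have

$$
\sum_{i=1}^{n} a_{ij} c_{ik} =
\begin{vmatrix}
a_{11} & \cdots & a_{1j} & \cdots & a_{1j} & \cdots & a_{1n} \\
a_{21} & \cdots & a_{2j} & \cdots & a_{2j} & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots & & \vdots \\
a_{n1} & \cdots & a_{nj} & \cdots & a_{nj} & \cdots & a_{nn}
\end{vmatrix} = 0
$$

since the determinant on the right is that of the matrix obtained from $A$
by replacing its $k$th column with its $j$th column, and so has two equal
columns.

Combining these two, we obtain

$$\sum_{i=1}^{n} a_{ij} c_{ik} = \delta_{jk} \det A \qquad (1)$$

DEFINITION 2.8 If $A \in M_n(K)$ then the adjoint of $A$, denoted by
adj $A$, is defined by

$$\operatorname{adj} A = (c_{ij})^t$$

i.e. the $n \times n$ matrix obtained by placing $c_{ji}$ in the $(i, j)$th position.
Then, using (1), we have

$$(\operatorname{adj} A)A = (c_{ij})^t (a_{ij}) = \Bigl(\sum_{i=1}^{n} c_{ij} a_{ik}\Bigr) = (\delta_{jk} \det A) = (\det A) I_n$$

Thus, if $\det A \neq 0$, then

$$\Bigl(\frac{\operatorname{adj} A}{\det A}\Bigr) A = I_n$$

or in other words, since the inverse of a matrix is unique,

$$A^{-1} = \frac{\operatorname{adj} A}{\det A}$$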
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EXAMPLE Let A = 



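$$
\begin{pmatrix}
1 & 0 & -1 \\
2 & 1 & -1 \\
1 & 2 & 5
\end{pmatrix}
$$

then $\det A = 4$ and

$$
\operatorname{adj} A =
\begin{pmatrix}
+\begin{vmatrix} 1 & -1 \\ 2 & 5 \end{vmatrix} &
-\begin{vmatrix} 0 & -1 \\ 2 & 5 \end{vmatrix} &
+\begin{vmatrix} 0 & -1 \\ 1 & -1 \end{vmatrix} \\[2ex]
-\begin{vmatrix} 2 & -1 \\ 1 & 5 \end{vmatrix} &
+\begin{vmatrix} 1 & -1 \\ 1 & 5 \end{vmatrix} &
-\begin{vmatrix} 1 & -1 \\ 2 & -1 \end{vmatrix} \\[2ex]
+\begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} &
-\begin{vmatrix} 1 & 0 \\ 1 & 2 \end{vmatrix} &
+\begin{vmatrix} 1 & 0 \\ 2 & 1 \end{vmatrix}
\end{pmatrix}
=
\begin{pmatrix}
7 & -2 & 1 \\
-11 & 6 & -1 \\
3 & -2 & 1
\end{pmatrix}
$$

Therefore

$$
A^{-1} = \frac{1}{4}
\begin{pmatrix}
7 & -2 & 1 \\
-11 & 6 & -1 \\
3 & -2 & 1
\end{pmatrix}
$$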



As mentioned earlier, this is not an efficient method for calculating
the inverse for large $n$: it involves the evaluation of one $n \times n$
determinant and $n^2$ determinants of size $(n - 1) \times (n - 1)$.
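The adjoint method is easy to express in code for small matrices. The following Python sketch is an illustration under stated assumptions: the function names are chosen here, indices are 0-based, and the test matrix is the one with rows $(1, 0, -1)$, $(2, 1, -1)$, $(1, 2, 5)$, which has determinant 4. As noted above, the method is not efficient for large $n$.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def cofactor(m, i, j):
    """Signed determinant of m with row i and column j deleted."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)


def adjoint(m):
    """adj A: the (i, j) entry is the cofactor c_ji (note the transpose)."""
    n = len(m)
    return [[cofactor(m, j, i) for j in range(n)] for i in range(n)]


def inverse(m):
    """A^{-1} = adj A / det A, valid only when det A is non-zero."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(entry, d) for entry in row] for row in adjoint(m)]


A = [[1, 0, -1], [2, 1, -1], [1, 2, 5]]
print(det(A))
print(adjoint(A))
print(inverse(A))
```

Using `Fraction` keeps the entries of the inverse exact, so the result can be compared directly with a hand calculation.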

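This may be applied to the solution of linear equations. Consider
the $n$ linear equations in $n$ variables

$$\sum_{j=1}^{n} a_{ij} x_j = \beta_i \quad (i = 1, \ldots, n)$$

or in matrix notation

$$AX = b$$

where

$$A = (a_{ij}), \quad X^t = (x_1, \ldots, x_n), \quad b^t = (\beta_1, \ldots, \beta_n)$$

If $A$ is invertible, then

$$X = A^{-1}b = \frac{\operatorname{adj} A}{\det A}\, b$$

or for $i = 1, 2, \ldots, n$

$$x_i = \frac{1}{\det A} \sum_{j=1}^{n} c_{ji} \beta_j = \frac{\det A_i}{\det A}$$

where $A_i$ is the matrix obtained from $A$ by replacing its $i$th column
with the column $b$, i.e.

$$
A_i = \begin{pmatrix}
a_{11} & \cdots & \beta_1 & \cdots & a_{1n} \\
a_{21} & \cdots & \beta_2 & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots \\
a_{n1} & \cdots & \beta_n & \cdots & a_{nn}
\end{pmatrix}
$$

This is the well known Cramer solution of linear equations.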
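EXAMPLE

$$
\begin{aligned}
x - z &= 1 \\
2x + y - z &= 1 \\
x + 2y + 5z &= 2
\end{aligned}
$$

Using the inverse of the matrix $A$ obtained in the above example we have

$$
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \frac{1}{4}
\begin{pmatrix}
7 & -2 & 1 \\
-11 & 6 & -1 \\
3 & -2 & 1
\end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}
= \frac{1}{4}
\begin{pmatrix} 7 \\ -7 \\ 3 \end{pmatrix}
$$

or alternatively, Cramer's solution gives

$$
x = \frac{1}{4} \det \begin{pmatrix} 1 & 0 & -1 \\ 1 & 1 & -1 \\ 2 & 2 & 5 \end{pmatrix} = \frac{7}{4}, \quad
y = \frac{1}{4} \det \begin{pmatrix} 1 & 1 & -1 \\ 2 & 1 & -1 \\ 1 & 2 & 5 \end{pmatrix} = -\frac{7}{4}, \quad
z = \frac{1}{4} \det \begin{pmatrix} 1 & 0 & 1 \\ 2 & 1 & 1 \\ 1 & 2 & 2 \end{pmatrix} = \frac{3}{4}
$$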
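Cramer's solution translates into a few lines of code. The sketch below (Python, with the illustrative function name `cramer` and 0-based column indices, both assumptions made here) solves the system $x - z = 1$, $2x + y - z = 1$, $x + 2y + 5z = 2$ by replacing each column of the coefficient matrix in turn with the right-hand side.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def cramer(a, b):
    """Solve AX = b by x_i = det A_i / det A; requires det A != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("det A = 0: Cramer's solution does not apply")
    solution = []
    for i in range(len(a)):
        # A_i is A with its i-th column replaced by the column b.
        a_i = [row[:i] + [b[k]] + row[i + 1:] for k, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution


A = [[1, 0, -1], [2, 1, -1], [1, 2, 5]]
b = [1, 1, 2]
x, y, z = cramer(A, b)
print(x, y, z)
```

Substituting the printed values back into the three equations confirms the solution.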



Exercises 2.4 

1. Use the methods of this section to find the inverses of the following 
matrices 


(i) 1-2 3 

6 0 
\ 4 1 



















































2. Use Cramer’s method to solve the following systems of linear 
equations 

(i) x- y + 2z = l (ii) 2x + 3y - 5z = 4 

2x + y + z = 2 x 4- 7y — 2z = 1 

x — 3y 4- z = 1 5* w 1 \y 4- 2z = — 2 

3. For what values of t will the following matrix be non-invertible? 
For all other values of t , what is the inverse? 



/! 

0 1 
\t 0 

4. If $A$ is an $n \times n$ matrix, prove that

$$\det(\operatorname{adj} A) = (\det A)^{n-1}$$

5. Find the adjoint of A y where 


A = 



0 

x + 1 
1 



If 


B = 



, for what complex values of x is the 


product of adj A and B non-invertible? 
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CHAPTER 3 


Vector Spaces 


In this and the remaining chapters the following notation will be used: 

K is an arbitrary field 
R is the field of real numbers 
C is the field of complex numbers 
Q is the field of rational numbers. 

3.1 Introduction 

In physical connections, a vector is usually defined to be a “physical 
quantity which has magnitude and direction.” As the work of this 
chapter involves a generalization of a vector, we first attempt to define 
more precisely a vector in the plane and in space. 

We shall first consider the case of a plane. By a point in the plane we
mean an ordered pair $(a, b)$ of real numbers (i.e. each point is
represented by coordinates $a$ and $b$ relative to a rectangular coordinate
system). The origin $O$ is the point $(0, 0)$. A directed line segment from
the point $P$ to the point $Q$, denoted by $\overrightarrow{PQ}$, is a line with initial point $P$
and end point $Q$, that is, a directed line segment is an ordered pair of
points (see Figure 1).

If $P = (p_1, p_2)$, $Q = (q_1, q_2)$, $R = (r_1, r_2)$, $S = (s_1, s_2)$ are four points
in the plane, the two directed line segments $\overrightarrow{PQ}$ and $\overrightarrow{RS}$ are said to be
equivalent if

$$(q_1 - p_1, q_2 - p_2) = (s_1 - r_1, s_2 - r_2)$$

This relation is an equivalence relation on the set of all directed line
segments in the plane, for it is clearly reflexive and symmetric and if
also $\overrightarrow{RS}$ is equivalent to $\overrightarrow{TU}$, where $T = (t_1, t_2)$, $U = (u_1, u_2)$ are two
further points in the plane, then

$$(q_1 - p_1, q_2 - p_2) = (s_1 - r_1, s_2 - r_2) = (u_1 - t_1, u_2 - t_2)$$
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Figure 1 


and the relation is also transitive. Now, it is easily seen that every
directed line segment is equivalent to a unique directed line segment
with initial point the origin $O$; i.e. if $P = (p_1, p_2)$, $Q = (q_1, q_2)$ then $\overrightarrow{PQ}$
is equivalent to $\overrightarrow{OR}$, where $R = (q_1 - p_1, q_2 - p_2)$. Thus, each equivalence
class of directed line segments can be represented by a directed line
segment with initial point the origin $O$. A vector $\overrightarrow{OP}$ in the plane is
defined to be the directed line segment with initial point $O$ and end
point $P$. A vector is completely determined by the co-ordinates of its
end point, thus a vector $\overrightarrow{OP}$ will be denoted by $(p_1, p_2)$, the co-ordinates
of the point $P$. Let $V_2(\mathbf{R})$ denote the set of all vectors in the plane, i.e.

$$V_2(\mathbf{R}) = \{(p_1, p_2) \mid p_1, p_2 \in \mathbf{R}\}$$

There is a well known rule, called the parallelogram rule, for adding
vectors. If $\overrightarrow{OP}$ and $\overrightarrow{OQ}$ are vectors, a vector $\overrightarrow{OR}$ corresponds to these
two vectors in the following way:

Let $\overrightarrow{QR}$ be the directed line segment with initial point $Q$ which is
equivalent to $\overrightarrow{OP}$ (i.e. $\overrightarrow{QR}$ is parallel to $\overrightarrow{OP}$ and with the same length
and $\overrightarrow{PR}$ is the directed line segment equivalent to $\overrightarrow{OQ}$). Then $\overrightarrow{OR}$ is the
sum of $\overrightarrow{OP}$ and $\overrightarrow{OQ}$, i.e.

$$\overrightarrow{OP} + \overrightarrow{OQ} = \overrightarrow{OR}$$

or

$$(p_1, p_2) + (q_1, q_2) = (p_1 + q_1, p_2 + q_2) \qquad (1)$$

The multiplication of a vector $\overrightarrow{OP}$ by a real number $\lambda$ can be dealt with
similarly: $\lambda \overrightarrow{OP}$ is the vector $\overrightarrow{OS}$ of length $|\lambda|\,OP$ in the same direction as
$\overrightarrow{OP}$ if $\lambda$ is positive and in the opposite direction if $\lambda$ is negative; this gives

$$\lambda(p_1, p_2) = (\lambda p_1, \lambda p_2) \qquad (2)$$

Thus, on $V_2(\mathbf{R})$ addition is defined by the rule (1) and scalar
multiplication by elements of $\mathbf{R}$ by (2).

Directed line segments and vectors in space (or 3-space) can be
defined similarly. If we let $V_3(\mathbf{R})$ denote the set of all vectors in space,
or

$$V_3(\mathbf{R}) = \{(\alpha, \beta, \gamma) \mid \alpha, \beta, \gamma \in \mathbf{R}\}$$

then similar geometric conditions will give the rule for addition in
$V_3(\mathbf{R})$ and scalar multiplication of elements of $V_3(\mathbf{R})$ by scalars in $\mathbf{R}$,
namely

$$(\alpha, \beta, \gamma) + (\alpha', \beta', \gamma') = (\alpha + \alpha', \beta + \beta', \gamma + \gamma')$$

and

$$\lambda(\alpha, \beta, \gamma) = (\lambda\alpha, \lambda\beta, \lambda\gamma)$$

With the pattern now developing, we can proceed to define $V_n(\mathbf{R})$
for $n \geq 4$, except that for these higher values of $n$ we do not have a
geometrical interpretation of our "vectors". We shall see in later sections
of this chapter and book that the generalization proves to be a useful
one. Indeed, rather than restricting ourselves to the real numbers only,
we can work over any field $K$. So we give the following

DEFINITION 3.1 If $K$ is an arbitrary field, let

$$V_n(K) = \{(\alpha_1, \alpha_2, \ldots, \alpha_n) \mid \alpha_i \in K \ (i = 1, \ldots, n)\}$$

Define addition of elements in $V_n(K)$ by setting

$$(\alpha_1, \alpha_2, \ldots, \alpha_n) + (\beta_1, \beta_2, \ldots, \beta_n) = (\alpha_1 + \beta_1, \alpha_2 + \beta_2, \ldots, \alpha_n + \beta_n)$$

and scalar multiplication of elements of $V_n(K)$ by elements of $K$ by

$$\alpha(\alpha_1, \alpha_2, \ldots, \alpha_n) = (\alpha\alpha_1, \alpha\alpha_2, \ldots, \alpha\alpha_n)$$

for all $\alpha, \alpha_i, \beta_i \in K$ $(i = 1, \ldots, n)$.

Having given these definitions, we can show that addition and scalar 
multiplication satisfy many rules which allow us to manipulate these 
vectors according to the usual laws of algebra. These are expressed in 
the following theorem. 

THEOREM 3.2 If $u = (\alpha_1, \ldots, \alpha_n)$, $v = (\beta_1, \ldots, \beta_n)$, $w = (\gamma_1, \ldots, \gamma_n)$
are arbitrary elements of $V_n(K)$ and $\alpha$, $\beta$ are arbitrary elements of $K$ then

(i) $u + v \in V_n(K)$

(ii) $u + (v + w) = (u + v) + w$

(iii) $u + v = v + u$

(iv) there exists a $0 \in V_n(K)$ such that $u + 0 = u$ for all $u \in V_n(K)$

(v) for each $u = (\alpha_1, \ldots, \alpha_n) \in V_n(K)$ there exists a
$-u = (-\alpha_1, \ldots, -\alpha_n) \in V_n(K)$ such that $u + (-u) = 0$

(vi) $\alpha u \in V_n(K)$

(vii) $(\alpha + \beta)u = \alpha u + \beta u$

(viii) $\alpha(u + v) = \alpha u + \alpha v$

(ix) $\alpha(\beta u) = (\alpha\beta)u$

(x) $1u = u$.

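PROOF The proof is elementary; we shall prove some parts and the
reader should verify the remainder. For example,

$$
\begin{aligned}
u + (v + w) &= (\alpha_1, \ldots, \alpha_n) + \{(\beta_1, \ldots, \beta_n) + (\gamma_1, \ldots, \gamma_n)\} \\
&= (\alpha_1, \ldots, \alpha_n) + (\beta_1 + \gamma_1, \ldots, \beta_n + \gamma_n) \\
&= (\alpha_1 + (\beta_1 + \gamma_1), \ldots, \alpha_n + (\beta_n + \gamma_n)) \\
&= ((\alpha_1 + \beta_1) + \gamma_1, \ldots, (\alpha_n + \beta_n) + \gamma_n) \\
&= (u + v) + w
\end{aligned}
$$

which proves (ii). The $0 \in V_n(K)$ required in (iv) is the $n$-tuple
$0 = (0, 0, \ldots, 0)$ and in (v)

$$
\begin{aligned}
(\alpha_1, \ldots, \alpha_n) + (-\alpha_1, \ldots, -\alpha_n) &= (\alpha_1 + (-\alpha_1), \ldots, \alpha_n + (-\alpha_n)) \\
&= (0, \ldots, 0) = 0 \qquad ∎
\end{aligned}
$$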
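The operations of Definition 3.1 can be modelled directly, which gives a way to spot-check the rules of Theorem 3.2 on particular vectors. In the sketch below (Python) the field $K$ is taken to be the rationals via `fractions.Fraction`; the function names `add` and `scale` and the sample vectors are assumptions made for illustration. A check on samples illustrates the rules but does not replace the proof.

```python
from fractions import Fraction


def add(u, v):
    """Componentwise addition in V_n(K)."""
    return tuple(a + b for a, b in zip(u, v))


def scale(alpha, u):
    """Scalar multiplication of u by alpha."""
    return tuple(alpha * a for a in u)


u = (Fraction(1, 2), Fraction(-3), Fraction(2))
v = (Fraction(4), Fraction(1, 3), Fraction(0))
w = (Fraction(-1), Fraction(5), Fraction(7, 2))
alpha, beta = Fraction(2, 3), Fraction(-5)
zero = (Fraction(0),) * 3

checks = {
    "(ii)": add(u, add(v, w)) == add(add(u, v), w),
    "(iii)": add(u, v) == add(v, u),
    "(iv)": add(u, zero) == u,
    "(v)": add(u, scale(Fraction(-1), u)) == zero,
    "(vii)": scale(alpha + beta, u) == add(scale(alpha, u), scale(beta, u)),
    "(viii)": scale(alpha, add(u, v)) == add(scale(alpha, u), scale(alpha, v)),
    "(ix)": scale(alpha, scale(beta, u)) == scale(alpha * beta, u),
    "(x)": scale(Fraction(1), u) == u,
}
print(checks)
```

Every entry of `checks` is `True`, since each rule reduces componentwise to the corresponding law in the field.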


We shall not develop further our study of $V_n(K)$ at this stage. Indeed
just as $V_2(\mathbf{R})$ and $V_3(\mathbf{R})$ were used to motivate the definition of $V_n(K)$
for arbitrary $n$ and arbitrary fields $K$, we shall use (i)-(x) in Theorem 3.2
to motivate the definition of a general abstract structure, called a vector
space, of which $V_n(K)$ is merely an example, albeit a very important
and typical example as will be seen before the end of this chapter.


3.2 Definition and Examples of Vector Spaces 

DEFINITION 3.3 A set $V$ is called a vector space over the field $K$ or
a $K$-space if

(a) (i) $V$ is closed with respect to a binary operation called
addition (+), i.e. if $u, v \in V$, then $u + v \in V$,

(ii) (commutative axiom) $u + v = v + u$ for all $u, v \in V$,

(iii) (associative axiom) $u + (v + w) = (u + v) + w$ for all
$u, v, w \in V$,

(iv) there exists an element $0 \in V$, called the zero element, such
that $u + 0 = 0 + u = u$ for all $u \in V$,

(v) for every $v \in V$, there exists an element $(-v) \in V$ such that
$v + (-v) = 0 = (-v) + v$.

(b) (i) for every $\alpha \in K$, $v \in V$ an element $\alpha v$, called the scalar
multiple of $v$ by $\alpha$, is defined and $\alpha v \in V$,

(ii) $\alpha(u + v) = \alpha u + \alpha v$ for all $\alpha \in K$, $u, v \in V$,

(iii) $(\alpha + \beta)v = \alpha v + \beta v$ for all $\alpha, \beta \in K$, $v \in V$,

(iv) $\alpha(\beta v) = (\alpha\beta)v$ for all $\alpha, \beta \in K$, $v \in V$,

(v) $1v = v$ for all $v \in V$.


EXAMPLES 


1. $V_n(K) = \{(\alpha_1, \ldots, \alpha_n) \mid \alpha_i \in K \ (i = 1, \ldots, n)\}$
was shown to be a $K$-space in the previous section.

2. Let $P(K)$ be the set of all polynomials in an indeterminate $x$ with
coefficients from $K$, i.e.

$$P(K) = \{\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m \mid \alpha_m \neq 0,\ \alpha_i \in K \ (i = 0, \ldots, m)\}$$

+ is the usual addition of polynomials and if $\alpha \in K$, then scalar
multiplication by $\alpha$ is defined by

$$\alpha(\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m) = \alpha\alpha_0 + \alpha\alpha_1 x + \cdots + \alpha\alpha_m x^m$$

Then $P(K)$ is a $K$-space. To verify this all the axioms in Definition 3.3
must be shown to be satisfied. In general, the verification is elementary,
as in, for example, 3.3 (a)(ii): if $f(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m$ and
$g(x) = \beta_0 + \beta_1 x + \cdots + \beta_n x^n$, assuming, without loss of generality,
that $m \geq n$, then $g(x) = \beta_0 + \beta_1 x + \cdots + \beta_n x^n + \cdots + \beta_m x^m$, where
$\beta_{n+1} = \cdots = \beta_m = 0$. Then

$$
\begin{aligned}
f(x) + g(x) &= (\alpha_0 + \beta_0) + (\alpha_1 + \beta_1)x + \cdots + (\alpha_m + \beta_m)x^m \\
&= (\beta_0 + \alpha_0) + (\beta_1 + \alpha_1)x + \cdots + (\beta_m + \alpha_m)x^m \\
&= g(x) + f(x)
\end{aligned}
$$

3. Let $P_n(K)$ be the set of polynomials of degree $\leq n$ with coefficients
in $K$, i.e.

$$P_n(K) = \{\alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n \mid \alpha_i \in K \ (i = 0, \ldots, n)\}$$

If addition (+) and scalar multiplication by elements of $K$ are defined as
in Example 2, an elementary verification again shows that $P_n(K)$ is a
$K$-space.
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4. Let X be an arbitrary set and K an arbitrary field. Let 

$$K^X = \{f : X \to K\}$$

be the set of all mappings (functions) of $X$ into $K$. If $f, g \in K^X$, $\alpha \in K$,
then addition and scalar multiplication are defined on $K^X$ by

$$(f + g)(x) = f(x) + g(x) \quad \text{for all } x \in X$$

$$(\alpha f)(x) = \alpha f(x) \quad \text{for all } x \in X$$

Then, it can be easily verified that $K^X$ is a $K$-space.

Some important special cases are given in the next example.

5. (i) If $X = K = \mathbf{R}$, then $\mathbf{R}^{\mathbf{R}}$ is the $\mathbf{R}$-space of real valued functions
defined on $\mathbf{R}$.

(ii) If $X$ is the closed interval $[a, b] = \{x \in \mathbf{R} \mid a \leq x \leq b\}$, where $a, b$
are any real numbers, and $K = \mathbf{R}$, then $\mathbf{R}^{[a,b]}$ is the $\mathbf{R}$-space of real
valued functions defined on the closed interval $[a, b]$.

(iii) If $X = \mathbf{N} = \{0, 1, 2, \ldots\}$ and $K = \mathbf{R}$ then $\mathbf{R}^{\mathbf{N}}$ is the $\mathbf{R}$-space of all
sequences $\{\alpha_n\}$ (see Exercises 3.3, No. 6).

6. $\mathbf{C}$ is an $\mathbf{R}$-space since $\mathbf{C} = \{(\alpha, \beta) \mid \alpha, \beta \in \mathbf{R}\} = V_2(\mathbf{R})$.

7. Every field $K$ is a $K$-space. Scalar multiplication of elements of $K$ by
elements of $K$ is simply "ordinary" multiplication in $K$. Thus all the
axioms for a $K$-space are covered by the defining axioms of a field.

In particular $\mathbf{C}$ is a $\mathbf{C}$-space. Examples 6 and 7 show that the same set
may be regarded as a vector space over more than one field.

8. Let $M_{m,n}(K)$ be the set of all $m \times n$ matrices over $K$. Addition of
matrices and scalar multiplication of matrices by elements of $K$ have
been defined in Chapter 1, § 1.4, where most of the axioms for vector
spaces were verified. The remaining axioms are easily verified and
$M_{m,n}(K)$ is a $K$-space. $M_n(K)$ denotes the vector space of all $n \times n$
matrices over $K$.

The examples of vector spaces given above have come from various
branches of mathematics: geometry, algebra, analysis and differential
equations. The theory of vector spaces is therefore potentially of
significance in the whole of mathematics. The notation $V_n(K)$, $P(K)$,
$P_n(K)$, $\mathbf{R}^{\mathbf{R}}$, $\mathbf{R}^{[a,b]}$, $M_{m,n}(K)$, $M_n(K)$ for these vector spaces will be
standard for the remainder of this book.

We now prove some elementary consequences of the definition of a
vector space.

THEOREM 3.4 Let $V$ be a $K$-space. Then

(a) the zero element $0 \in V$ defined in 3.3(a)(iv) is unique

(b) if $v \in V$, then $(-v)$ defined in 3.3(a)(v) is unique

(c) if $u, v, w \in V$ and $u + v = u + w$ then $v = w$

(d) if $u, v \in V$ then $u + x = v$ has a unique solution $v - u$ in $V$

(e) $-(-v) = v$ for all $v \in V$

(f) $0.v = 0$ for all $v \in V$

(g) $-(\alpha v) = (-\alpha)v = \alpha(-v)$ and $(-\alpha)(-v) = \alpha v$ for all $\alpha \in K$, $v \in V$.

PROOF (a) Let $0'$ also be a zero element, then

$$v + 0' = v = 0' + v \quad \text{for all } v \in V$$

Thus, in particular

$$0 = 0 + 0' = 0' + 0 = 0'$$

since $v + 0 = v$ for all $v \in V$, i.e. $0 = 0'$.

(b) Let $v'$ also be an additive inverse of $v$, i.e.

$$v + v' = v' + v = 0$$

and

$$v + (-v) = (-v) + v = 0$$

Then

$$v' = v' + 0 = v' + (v + (-v)) = (v' + v) + (-v) = 0 + (-v) = -v$$

(c) If $u + v = u + w$ then

$$(-u) + (u + v) = (-u) + (u + w)$$

i.e.

$$((-u) + u) + v = ((-u) + u) + w$$

i.e. $0 + v = 0 + w$, i.e. $v = w$.

(d) Since $u + ((-u) + v) = (u + (-u)) + v = 0 + v = v$, the equation $u + x = v$
has at least one solution. This solution is unique, for if $x$ and $x'$ are
solutions then $u + x = u + x'$ and by (c) $x = x'$. This unique solution is
denoted by $v - u$.

(e) If $v \in V$, we have $(-v) + (-(-v)) = 0 = (-v) + v$, which by (c)
implies $-(-v) = v$.

(f) If $v \in V$, then $0.v = (0 + 0).v = 0.v + 0.v$, but $0.v = 0.v + 0$,
where the last $0$ is the zero element of $V$, and so by (c) $0.v = 0$.

(g) If $\alpha \in K$, $v \in V$ then $(\alpha v) + (-(\alpha v)) = 0$ and $v + (-v) = 0$
implies $\alpha v + \alpha(-v) = \alpha.0 = 0$ and by (b) $\alpha(-v) = -(\alpha v)$. Also,
$\alpha + (-\alpha) = 0$ and so $(\alpha + (-\alpha)).v = 0.v = 0$ by (f). Thus
$\alpha v + (-\alpha)v = 0 = \alpha v + (-(\alpha v))$ and so $(-\alpha)v = -(\alpha v)$. Finally,
$(-\alpha)(-v) = \alpha(-(-v)) = \alpha v$ by (e). ∎
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Note The reader should determine exactly which of the defining 
axioms for a vector space are used in the various parts of the above 
proof. 

3.3 Subspaces 
Let $V$ be a $K$-space.

DEFINITION 3.5 A non-empty subset W of V is called a subspace of 
V if W is itself a K-space with the same definition of addition and scalar 
multiplication as in V. 

By inspecting the defining axioms for a vector space, we note that 
some of the axioms are automatically true in W if W is a subset of V, 
i.e. 3.3 (a)(ii), (iii) and (b)(ii), (iii), (iv) and (v). Therefore, to verify 
that a subset W of V is a subspace we need only show that 

(i) if $u, v \in W$ then $u + v \in W$

(ii) $0 \in W$

(iii) for each $v \in W$, $-v \in W$

(iv) if $\alpha \in K$, $v \in W$ then $\alpha v \in W$.

In fact, we can prove

LEMMA 3.6 A non-empty subset $W$ of $V$ is a subspace of $V$ if and
only if $\alpha u + v \in W$ for all $\alpha \in K$, $u, v \in W$.

PROOF If $W$ is a subspace of $V$, then by definition, if $\alpha \in K$, $u, v \in W$,
then $\alpha u \in W$ and $\alpha u + v \in W$. Conversely, since $W$ is non-empty, there
exists an element $u \in W$ and thus $(-1)u + u = -u + u = 0 \in W$ and
hence if $\alpha \in K$, $v \in W$, $\alpha.v + 0 = \alpha v \in W$ and in particular $-1.v = -v \in W$.
Finally, if $u, v \in W$, $1u + v = u + v \in W$. That is, (i)-(iv) above have been
verified, and $W$ is a subspace of $V$. ∎

EXAMPLES 

1. Every vector space V has two subspaces, namely V and {0}. Every 
other subspace is called a proper subspace. 

2. Let $W = \{(0, \alpha_2, \ldots, \alpha_n) \mid \alpha_i \in K \ (i = 2, \ldots, n)\} \subset V_n(K)$, then $W$
is a subspace of $V_n(K)$, since

(i) $W$ is non-empty, for example $(0, \ldots, 0) \in W$

(ii) if $\alpha \in K$, $u = (0, \alpha_2, \ldots, \alpha_n)$, $v = (0, \beta_2, \ldots, \beta_n) \in W$, then
$\alpha u + v = (0, \alpha\alpha_2 + \beta_2, \ldots, \alpha\alpha_n + \beta_n) \in W$.

Let $W = \{(1 + \alpha_2, \alpha_2, \ldots, \alpha_n) \mid \alpha_i \in K \ (i = 2, \ldots, n)\} \subseteq V_n(K)$;
since, for example, $(2, 1, 0, \ldots, 0) \in W$ but $2(2, 1, 0, \ldots, 0) =
(4, 2, 0, \ldots, 0) \notin W$, $W$ is not a subspace of $V_n(K)$.

3. Let $M_n^{(s)}(K) = \{A \in M_n(K) \mid A^t = A\} \subset M_n(K)$, then $M_n^{(s)}(K)$ is a
subspace of $M_n(K)$ since $0 \in M_n^{(s)}(K)$ and if $\alpha \in K$, $A, B \in M_n^{(s)}(K)$ then

$$(\alpha A + B)^t = \alpha A^t + B^t = \alpha A + B$$

i.e. $\alpha A + B \in M_n^{(s)}(K)$.
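The criterion of Lemma 3.6 lends itself to a quick experiment on the two subsets of Example 2. The Python sketch below is illustrative only: the membership tests, the helper `closed_on_samples` and the sample vectors (with $n = 3$) are assumptions made here. A finite sample can refute closure but can never prove it, so a `True` result is merely consistent with the subset being a subspace.

```python
from fractions import Fraction


def in_w(x):
    """Membership in W = {(0, a_2, ..., a_n)}: first coordinate is 0."""
    return x[0] == 0


def in_w_shifted(x):
    """Membership in {(1 + a_2, a_2, ..., a_n)}: first = 1 + second."""
    return x[0] == 1 + x[1]


def closed_on_samples(member, vectors, scalars):
    """Check that alpha*u + v stays in the set for all sample choices."""
    return all(
        member(tuple(alpha * ui + vi for ui, vi in zip(u, v)))
        for alpha in scalars
        for u in vectors
        for v in vectors
    )


scalars = [0, 1, -2, Fraction(1, 2)]
w_samples = [(0, 1, 2), (0, -3, 5), (0, 0, 0)]
shifted_samples = [(2, 1, 0), (1, 0, 4)]

print(closed_on_samples(in_w, w_samples, scalars))
print(closed_on_samples(in_w_shifted, shifted_samples, scalars))
```

The first call finds no counterexample, while the second returns `False`: already $(2, 1, 0) + (2, 1, 0) = (4, 2, 0)$ fails the membership test, matching the counterexample in the text.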

4. Let $V$ be the set of solutions of the system of $m$ linear homogeneous
equations in $n$ variables $\sum_{j=1}^{n} a_{ij} x_j = 0$ $(i = 1, \ldots, m)$, or $AX = 0$, where
$A = (a_{ij})$, $X^t = (x_1, \ldots, x_n)$. Then $V$ is a subspace of $V_n(K)$, called the
solution space of the system, i.e. $0 \in V$ and if $\alpha \in K$, $X, Y \in V$ then

$$A(\alpha X + Y) = \alpha AX + AY = 0$$

and $\alpha X + Y \in V$. This solution space will be considered later when the
vector space theory to be developed will be applied to linear equations.
Indeed, the problem of determining the solution of a system of linear
homogeneous equations may be rephrased as the determination of the
solution space of the system.
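The closure property of the solution space can be seen on a small instance. In the following Python sketch the coefficient matrix `A`, the two solutions `X` and `Y` and the scalar `alpha` are hypothetical values chosen for illustration; the code only checks that $\alpha X + Y$ is again a solution of $AX = 0$.

```python
from fractions import Fraction


def matvec(a, x):
    """Product of the matrix a (list of rows) with the column vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


A = [[1, 2, -1], [-2, -4, 2]]   # hypothetical homogeneous system AX = 0
X = [1, 0, 1]                   # one solution
Y = [-2, 1, 0]                  # another solution
alpha = Fraction(3, 2)

combination = [alpha * x + y for x, y in zip(X, Y)]
print(matvec(A, X), matvec(A, Y), matvec(A, combination))
```

All three products are the zero vector, as the computation $A(\alpha X + Y) = \alpha AX + AY = 0$ predicts.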

5. Let $C[a, b]$ be the set of real valued continuous functions defined on
the closed interval $[a, b]$, then $C[a, b]$ is a subspace of $\mathbf{R}^{[a,b]}$. In fact,
if $f, g \in C[a, b]$, $\alpha \in \mathbf{R}$, then $\alpha f + g \in C[a, b]$. This follows from
theorems in elementary calculus, which prove that if $f$ and $g$ are
continuous functions then $\alpha f + g$ is continuous (see for example
A.S.-T. Lue, Basic Pure Mathematics II, VNR New Mathematics
Library 5, p.51).

Let $C'[a, b]$ be the set of all continuously differentiable functions
defined on a closed interval $[a, b]$, i.e. the set of all functions which
possess a continuous derivative at every point in $[a, b]$, then again
theorems in elementary calculus show that $C'[a, b]$ is a subspace of
$\mathbf{R}^{[a,b]}$. The question arises whether $C'[a, b]$ and $C[a, b]$ are the same
spaces; in fact, $C'[a, b] \subset C[a, b]$, as a differentiable function is
continuous. The inclusion is strict, since, for example, the function
$f(x) = |x|$ is continuous but has no derivative at $x = 0$. Sometimes, we
are required to deal with functions which may be differentiated a
number of times; we therefore denote by $C^n[a, b]$ the set of all $n$
times continuously differentiable functions on $[a, b]$ and by $C^\infty[a, b]$
the set of functions which may be differentiated any number of times. Again,
it may be verified that $C^n[a, b]$ $(n \geq 1)$ and $C^\infty[a, b]$ are $\mathbf{R}$-spaces (i.e.
they are subspaces of $\mathbf{R}^{[a,b]}$) and that if $m \geq n$, $C^m[a, b]$ is a subspace
of $C^n[a, b]$. We shall add $C[a, b]$, $C^n[a, b]$ and $C^\infty[a, b]$ to the list of
standard notation for vector spaces given on p. 69. In addition
$C(\mathbf{R})$, $C^n(\mathbf{R})$, $C^\infty(\mathbf{R})$ will be used when $[a, b] = \mathbf{R}$.

6. Let $V$ be the subset of $C^2[a, b]$ consisting of the solutions of the fixed
homogeneous linear differential equation

$$\frac{d^2 y}{dx^2} + a(x)\frac{dy}{dx} + b(x)y = 0,$$

where $a(x), b(x) \in C[a, b]$. If $f, g \in V$, $\alpha \in \mathbf{R}$ then

$$
\begin{aligned}
&\frac{d^2}{dx^2}(f + g)(x) + a(x)\frac{d}{dx}(f + g)(x) + b(x)(f + g)(x) \\
&\quad = \Bigl(\frac{d^2}{dx^2}f(x) + a(x)\frac{d}{dx}f(x) + b(x)f(x)\Bigr)
+ \Bigl(\frac{d^2}{dx^2}g(x) + a(x)\frac{d}{dx}g(x) + b(x)g(x)\Bigr) \\
&\quad = 0
\end{aligned}
$$

where again some theorems in elementary calculus have been used (loc.
cit. p.57). Thus $f + g \in V$ and similarly $\alpha f \in V$, and $V$ is a subspace of
$C^2[a, b]$. $V$ is the solution space of the above differential equation, which
is an important example of a vector space. Sometimes additional
"boundary conditions" must be satisfied. For example, let $U \subset C^2[0, \pi]$
be the solution space of the differential equation

$$\frac{d^2 y}{dx^2} + y = 0$$

with boundary conditions $y(0) = 0$, $y(\pi) = 0$. Then it is easily verified
that $U$ is a subspace of $C^2[0, \pi]$.

We now consider ways of constructing some important subspaces of
a vector space. If $U$ and $W$ are subspaces of $V$, then the intersection of
$U$ and $W$, $U \cap W$, is also a subspace, for $0 \in U \cap W$ and if $\alpha \in K$,
$u, v \in U \cap W$ then $u, v \in U$ and $u, v \in W$ and $\alpha u + v \in U$ and
$\alpha u + v \in W$, which implies that $\alpha u + v \in U \cap W$. But, in general the
union of $U$ and $W$, $U \cup W$, is not a subspace of $V$; for example,
$U = \{(\alpha, 0) \mid \alpha \in K\}$ and $W = \{(0, \beta) \mid \beta \in K\}$ are both subspaces of
$V_2(K)$, but $(1, 0) + (0, 1) = (1, 1) \notin U \cup W$. The position
is rescued by the definition of the sum of two subspaces, i.e.
$U + W = \{u + w \mid u \in U, w \in W\}$, which is easily shown to be a subspace
of $V$. In fact, this can be extended to cover the sum of more than two
subspaces.


DEFINITION 3.7 Let $W_1, W_2, \ldots, W_k$ be subspaces of a vector
space $V$ over $K$. Define

$$W = W_1 + \cdots + W_k = \{w_1 + w_2 + \cdots + w_k \mid w_i \in W_i \ (i = 1, \ldots, k)\}$$

then $W$ is called the sum of the subspaces $W_1, \ldots, W_k$.

Then we can prove by using Lemma 3.6,

LEMMA 3.8 (i) $W_1 \cap W_2 \cap \cdots \cap W_k$ is a subspace of $V$

(ii) $W_1 + W_2 + \cdots + W_k$ is a subspace of $V$.

PROOF (i) Since $0 \in W_i$ $(i = 1, \ldots, k)$, $0 \in W_1 \cap W_2 \cap \cdots \cap W_k$
and so it is non-empty. If $\alpha \in K$, $u, v \in W_1 \cap W_2 \cap \cdots \cap W_k$, then
$u, v \in W_i$ $(i = 1, \ldots, k)$ and $\alpha u + v \in W_i$ $(i = 1, \ldots, k)$ and so
$\alpha u + v \in W_1 \cap W_2 \cap \cdots \cap W_k$.

(ii) $0 = 0 + \cdots + 0 \in W_1 + W_2 + \cdots + W_k$ since $0 \in W_i$ $(i = 1, \ldots, k)$.
Further, if $\alpha \in K$, $u, v \in W_1 + W_2 + \cdots + W_k$, then $u = u_1 + \cdots + u_k$,
$v = v_1 + \cdots + v_k$, where $u_i, v_i \in W_i$ $(i = 1, \ldots, k)$. Thus,
$\alpha u_i + v_i \in W_i$ $(i = 1, \ldots, k)$ and it follows that

$$\alpha u + v = \alpha(u_1 + \cdots + u_k) + (v_1 + \cdots + v_k) = (\alpha u_1 + v_1) + \cdots + (\alpha u_k + v_k),$$

that is, $\alpha u + v \in W_1 + \cdots + W_k$. ∎

Now, let $S = \{v_1, \ldots, v_k\}$ be a finite subset of a vector space $V$. Let
$U = \bigl\{\sum_{i=1}^{k} \alpha_i v_i \mid \alpha_i \in K\bigr\}$. Then, we can prove

LEMMA 3.9 $U$ is a subspace of $V$.

PROOF $0 \in U$, since $0 = 0.v_1 + \cdots + 0.v_k \in U$.
If $\alpha \in K$, $u, v \in U$ then $u = \sum_{i=1}^{k} \alpha_i v_i$ and $v = \sum_{i=1}^{k} \beta_i v_i$ for some
$\alpha_i, \beta_i \in K$ $(i = 1, \ldots, k)$. Then

$$\alpha u + v = \alpha\Bigl(\sum_{i=1}^{k} \alpha_i v_i\Bigr) + \Bigl(\sum_{i=1}^{k} \beta_i v_i\Bigr) = \sum_{i=1}^{k} (\alpha\alpha_i + \beta_i) v_i \in U$$

and so, by Lemma 3.6, $U$ is a subspace of $V$. ∎

DEFINITION 3.10 If $S = \{v_1, \ldots, v_k\} \subseteq V$, then $\sum_{i=1}^{k} \alpha_i v_i$, where
$\alpha_i \in K$, is called a $K$-linear combination of $S$. The subspace $U$ constructed
above is called the subspace generated (or spanned) by the subset $S$. We
shall denote this by $U = \langle S \rangle$ or $\langle v_1, \ldots, v_k \rangle$. A subspace $U$ of $V$ which is
generated by a finite subset of $V$ is called a finitely generated subspace
of $V$. These ideas will be illustrated by examples to be considered in the
next section.

(Definition 3.10 can be extended to cover infinite subsets. If $S$ is a
subset of $V$, let $U = \bigl\{\sum_{s \in S} \alpha_s s \mid \alpha_s \in K\bigr\}$, where in each sum, all but a
finite number of the $\alpha_s$ are equal to zero. Then $U$ is a subspace of $V$
and is the subspace generated by $S$.)

Exercises 3.3 

1. Which of the following subsets of $V_n(\mathbf{R})$ $(n > 3)$ are vector spaces?

(i) $\{(\alpha_1, \alpha_2, \ldots, \alpha_n) \mid \alpha_1 + \alpha_2 + \alpha_3 = 0\}$

(ii) $\{(\alpha_1, \alpha_2, \ldots, \alpha_n) \mid \alpha_1 + \alpha_2 + \alpha_3 = 1\}$

(iii) $\{(\alpha_1, \alpha_2, \ldots, \alpha_n) \mid \alpha_3 = \alpha_1 \alpha_2\}$

(iv) $\{(\alpha_1, \alpha_2, \ldots, \alpha_n) \mid |\alpha_1| > 0\}$

(v) $\{(\alpha_1, \alpha_2, \ldots, \alpha_n) \mid \alpha_1 - \alpha_2 = \alpha_2 - \alpha_3\}$

2. Which of the following subsets of $\mathbf{R}^{\mathbf{R}}$ are $\mathbf{R}$-spaces?

(i) The set of all real polynomials of degree exactly $n$

(ii) The set of all real polynomials of degree $\leq n$

(iii) The set of all odd functions (i.e. $f(-x) = -f(x)$ for all $x \in \mathbf{R}$)

(iv) The set of all differentiable functions.

3. Which of the following subsets of $C[0, 1]$ are $\mathbf{R}$-spaces?

(i) $\{f \in C[0, 1] \mid f(1) = 0\}$

(ii) $\{f \in C[0, 1] \mid f(1) = 1\}$

(iii) $\{f \in C[0, 1] \mid f(0) = f(1)\}$

(iv) $\{f \in C[0, 1] \mid \int_0^1 f(t)\,dt = 0\}$

(v) $\{f \in C[0, 1] \mid \int_0^1 f(t)\,dt = 1\}$

4. Which of the following subsets are subspaces of $M_2(\mathbf{R})$?

(iv) $\{A \in M_2(\mathbf{R}) \mid \det A = 0\}$

(v) $\{A \in M_2(\mathbf{R}) \mid \det A = 1\}$

(vi) $\{A \in M_2(\mathbf{R}) \mid AB = BA \text{ for a fixed } B \in M_2(\mathbf{R})\}$

(vii) $\{A \in M_2(\mathbf{R}) \mid A^2 = A\}$.

5. Which of the following subsets are subspaces of $M_3(\mathbf{R})$?

(i) $\{A \in M_3(\mathbf{R}) \mid \det A = 0\}$

(ii) $\{A \in M_3(\mathbf{R}) \mid \sum_{i=1}^{3} a_{ii} = 0\}$

(iii) $\{A \in M_3(\mathbf{R}) \mid A^t = -A\}$.

6. Let V be the set of all sequences $\{\alpha_n\} = (\alpha_0, \alpha_1, \dots, \alpha_n, \dots)$ of elements of R. Define addition of sequences by

$$\{\alpha_n\} + \{\beta_n\} = \{\alpha_n + \beta_n\}$$

and scalar multiplication by

$$\alpha\{\alpha_n\} = \{\alpha\alpha_n\}.$$

Prove that V is a vector space over R.

7. Prove that the following are vector spaces over R

(i) $\{\{\alpha_n\} \mid \alpha_{n+1} - \alpha_n = \alpha_{n+2} - \alpha_{n+1},\ n \ge 0,\ \alpha_n \in \mathbf{R}\}$,
i.e. the set of all arithmetical progressions.

(ii) $\{\{\alpha_n\} \mid \alpha_{n+2} = \alpha_{n+1} + \alpha_n,\ n \ge 0,\ \alpha_n \in \mathbf{R}\}$,
i.e. the set of all Fibonacci sequences.

(iii) the set of all convergent sequences,
i.e. $\{\{\alpha_n\} \mid$ for each $\varepsilon > 0$ there exists an integer N such that $|\alpha_n - \alpha| < \varepsilon$ for $n > N$, for some $\alpha\}$.

(iv) the set of all Cauchy sequences,
i.e. $\{\{\alpha_n\} \mid$ for each $\varepsilon > 0$ there exists an integer N such that $|\alpha_m - \alpha_n| < \varepsilon$ for $m, n > N\}$.

3.4 Linear Independence, Basis and Dimension 
Let V be a K-space.

DEFINITION 3.11 A subset $\{v_1, \dots, v_k\}$ in V is linearly dependent over K if there exist $\alpha_1, \dots, \alpha_k \in K$ (not all zero) such that

$$\alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_k v_k = 0.$$

If not, $\{v_1, \dots, v_k\}$ is linearly independent over K, or in other words, if $\alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_k v_k = 0$ implies that $\alpha_i = 0$ $(i = 1, \dots, k)$, then $\{v_1, \dots, v_k\}$ is linearly independent over K.
EXAMPLES

1. $\{(1, 0, 1), (1, -1, 1), (2, -1, 2), (0, 0, 1)\}$ is linearly dependent over R since

$$(1, 0, 1) + (1, -1, 1) - (2, -1, 2) + 0(0, 0, 1) = (0, 0, 0)$$

but $\{(1, 0, 1), (0, 0, 1)\} \subseteq V_3(\mathbf{R})$ is linearly independent over R, for if $\alpha(1, 0, 1) + \beta(0, 0, 1) = (0, 0, 0)$, then $\alpha = 0$, $\alpha + \beta = 0$, or $\alpha = 0$, $\beta = 0$.

2. We have seen in examples 5 and 6 in §3.2 that C is an R-space and a C-space.

$1 + i$ and $i$ are linearly dependent over C, since $1(1 + i) + (-1 + i)i = 0$, but linearly independent over R, for if $\alpha(1 + i) + \beta i = 0$, $\alpha, \beta \in \mathbf{R}$, then $\alpha = 0$ and $\alpha + \beta = 0$, or $\alpha = \beta = 0$.
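The dependence of $1 + i$ and $i$ on the choice of scalars can be checked numerically. The following is a small illustrative sketch, not part of the original text: over C it evaluates the relation quoted above, and over R it identifies $a + bi$ with the pair $(a, b)$ and uses the 2 x 2 determinant criterion (proved later as Theorem 3.13). The variable names are illustrative assumptions.

```python
# Over C: the relation from the text, 1(1 + i) + (-1 + i)i = 0,
# has coefficients that are not both zero.
relation_over_c = 1 * (1 + 1j) + (-1 + 1j) * 1j

# Over R: identify a + bi with the pair (a, b) of real numbers.
u = (1, 1)  # 1 + i
v = (0, 1)  # i

# 2 x 2 determinant of the matrix with columns u and v;
# a non-zero value means u and v are linearly independent over R.
det_over_r = u[0] * v[1] - u[1] * v[0]

print(relation_over_c == 0)  # dependent over C
print(det_over_r != 0)       # independent over R
```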

(Definition 3.11 can also be extended to cover infinite subsets: a subset $S \subseteq V$ is linearly independent over K if every finite subset of S is linearly independent over K, otherwise S is linearly dependent over K.)

DEFINITION 3.12 A subset $S \subseteq V$ is a K-basis for V if

(i) S generates V

(ii) S is linearly independent over K.

If $S = \{v_1, \dots, v_k\}$ is a finite set, this means that if $v \in V$ then $v = \sum_{i=1}^{k} \alpha_i v_i$, where $\alpha_i \in K$ $(i = 1, \dots, k)$, and if $\sum_{i=1}^{k} \beta_i v_i = 0$, $\beta_i \in K$ $(i = 1, \dots, k)$, then $\beta_i = 0$ $(i = 1, \dots, k)$.

We shall concentrate on this case, that is, finitely generated K-spaces. Before illustrating these ideas with examples, we give a useful determinantal criterion for the n columns of a square matrix to be linearly independent. We prove

THEOREM 3.13 If $A = (a_{ij}) \in M_n(K)$ and $c_j = (a_{1j}, a_{2j}, \dots, a_{nj})$ $(j = 1, 2, \dots, n)$ are the n columns of A, then $\{c_1, c_2, \dots, c_n\}$ is linearly dependent over K if and only if det A = 0.

PROOF If $\{c_1, c_2, \dots, c_n\}$ is linearly dependent over K, then there exist $\alpha_1, \alpha_2, \dots, \alpha_n \in K$ (not all zero) such that

$$\alpha_1 c_1 + \alpha_2 c_2 + \dots + \alpha_n c_n = 0.$$

Without loss of generality we assume that $\alpha_1 \neq 0$; then

$$c_1 + \frac{\alpha_2}{\alpha_1} c_2 + \dots + \frac{\alpha_n}{\alpha_1} c_n = 0.$$

Now, adding $\alpha_2/\alpha_1$ times the second column, ..., and $\alpha_n/\alpha_1$ times the nth column to the first column gives a matrix with first column zero. Thus by Remark 2 in §2.2, det A = 0.

If det A = 0, then by Corollary 3 to Theorem 2.7, the system of linear equations

$$\sum_{j=1}^{n} a_{ij} x_j = 0 \quad (i = 1, 2, \dots, n)$$

has a non-trivial solution $(x_1, x_2, \dots, x_n)$, that is, there exist $x_1, x_2, \dots, x_n \in K$ (not all zero) such that

$$x_1 c_1 + x_2 c_2 + \dots + x_n c_n = 0$$

which shows that $\{c_1, c_2, \dots, c_n\}$ is linearly dependent over K. ∎

By considering the transpose $A^t$ of A, we obtain

COROLLARY The n rows of a matrix $A \in M_n(K)$ are linearly dependent over K if and only if det A = 0.

EXAMPLES

1. Let $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, $e_3 = (0, 0, 1)$; then $\{e_1, e_2, e_3\}$ is an R-basis for the R-space $V_3(\mathbf{R})$, since

(i) if $v = (\alpha_1, \alpha_2, \alpha_3)$ is an arbitrary element in $V_3(\mathbf{R})$, then

$$(\alpha_1, \alpha_2, \alpha_3) = \alpha_1(1, 0, 0) + \alpha_2(0, 1, 0) + \alpha_3(0, 0, 1)$$

that is, $\{e_1, e_2, e_3\}$ generates $V_3(\mathbf{R})$.

(ii) if $\alpha_1 e_1 + \alpha_2 e_2 + \alpha_3 e_3 = 0$, then $(\alpha_1, \alpha_2, \alpha_3) = (0, 0, 0)$, or $\alpha_1 = \alpha_2 = \alpha_3 = 0$, that is, $\{e_1, e_2, e_3\}$ is linearly independent over R.

Let $v_1 = (1, 0, 1)$, $v_2 = (2, -1, 1)$, $v_3 = (4, 1, 1)$; then $\{v_1, v_2, v_3\}$ is also an R-basis for $V_3(\mathbf{R})$. It can easily be verified that

$$(\alpha_1, \alpha_2, \alpha_3) = \tfrac{1}{4}\{(-2\alpha_1 + 2\alpha_2 + 6\alpha_3)v_1 + (\alpha_1 - 3\alpha_2 - \alpha_3)v_2 + (\alpha_1 + \alpha_2 - \alpha_3)v_3\}$$

that is, $\{v_1, v_2, v_3\}$ generates $V_3(\mathbf{R})$. Also, $\{v_1, v_2, v_3\}$ is linearly independent over R, for if

$$\alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 v_3 = 0$$

where $\alpha_1, \alpha_2, \alpha_3 \in \mathbf{R}$, then

$$\alpha_1(1, 0, 1) + \alpha_2(2, -1, 1) + \alpha_3(4, 1, 1) = (0, 0, 0)$$

or

$$\begin{aligned} \alpha_1 + 2\alpha_2 + 4\alpha_3 &= 0 \\ -\alpha_2 + \alpha_3 &= 0 \\ \alpha_1 + \alpha_2 + \alpha_3 &= 0 \end{aligned}$$

Since

$$\det \begin{pmatrix} 1 & 2 & 4 \\ 0 & -1 & 1 \\ 1 & 1 & 1 \end{pmatrix} = 4 \neq 0,$$

the trivial solution $\alpha_1 = \alpha_2 = \alpha_3 = 0$ is the unique solution of this system (see Theorem 3.13).

This example shows that an R-basis for a vector space is not unique. 
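The determinantal test of Theorem 3.13 is easy to mechanise. The sketch below is an illustration added here, not the book's own method: it computes a determinant by Gaussian elimination over the rationals (Python's `fractions` module) and applies it to the vectors of Example 1. The function names `det` and `is_dependent` are assumptions made for this example. Since det A = det $A^t$, it does not matter whether the vectors are entered as rows or as columns.

```python
from fractions import Fraction

def det(matrix):
    """Determinant of a square matrix by Gaussian elimination over Q."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: determinant is zero
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # interchanging two rows changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result

def is_dependent(vectors):
    """n vectors of V_n(Q) are linearly dependent iff the determinant is 0."""
    return det(vectors) == 0

v1, v2, v3 = (1, 0, 1), (2, -1, 1), (4, 1, 1)
print(det([v1, v2, v3]))                                    # 4
print(is_dependent([v1, v2, v3]))                           # False
print(is_dependent([(1, 0, 1), (1, -1, 1), (2, -1, 2)]))    # True
```

The third set is dependent because $(2, -1, 2) = (1, 0, 1) + (1, -1, 1)$.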

2. Let $e_1 = (1, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, ..., $e_n = (0, 0, \dots, 1) \in V_n(K)$. If $(\alpha_1, \alpha_2, \dots, \alpha_n)$ is an arbitrary element in $V_n(K)$, then

$$(\alpha_1, \dots, \alpha_n) = \sum_{i=1}^{n} \alpha_i e_i$$

that is, $\{e_1, \dots, e_n\}$ generates $V_n(K)$.

If $\sum_{i=1}^{n} \alpha_i e_i = 0$, then $(\alpha_1, \dots, \alpha_n) = (0, \dots, 0)$, or $\alpha_i = 0$ $(i = 1, \dots, n)$, that is, $\{e_1, \dots, e_n\}$ is linearly independent over K. Thus $\{e_1, \dots, e_n\}$ is a K-basis for $V_n(K)$, called the standard K-basis for $V_n(K)$.

3. Let $V = M_2(\mathbf{R})$ and let

$$e_{11} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad e_{12} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad e_{21} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad e_{22} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$

then $\{e_{11}, e_{12}, e_{21}, e_{22}\}$ is an R-basis for $M_2(\mathbf{R})$, since

(i) if $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in M_2(\mathbf{R})$ then

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \alpha e_{11} + \beta e_{12} + \gamma e_{21} + \delta e_{22}$$

and $\{e_{11}, e_{12}, e_{21}, e_{22}\}$ generates $M_2(\mathbf{R})$ over R.

(ii) to show that this set is linearly independent over R, consider

$$\alpha e_{11} + \beta e_{12} + \gamma e_{21} + \delta e_{22} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix};$$

then

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

which implies that $\alpha = \beta = \gamma = \delta = 0$.

4. We now show similarly that $S = \{e_{ij} \mid i, j = 1, \dots, n\}$ is a K-basis for $M_n(K)$, where $e_{ij}$ is the n x n matrix with 1 in the (i, j)-position and zero elsewhere. If $(a_{ij})$ is an arbitrary matrix in $M_n(K)$ then

$$(a_{ij}) = \sum_{i,j=1}^{n} a_{ij} e_{ij}$$

and so S generates $M_n(K)$ over K. Consider

$$\sum_{i,j=1}^{n} \alpha_{ij} e_{ij} = 0$$

where $\alpha_{ij} \in K$ $(i, j = 1, \dots, n)$; then $(\alpha_{ij}) = 0$, which implies that $\alpha_{ij} = 0$ $(i, j = 1, \dots, n)$ and S is linearly independent over K.

5. Let $V = V_2(\mathbf{C})$. In Example 2, we saw that $\{(1, 0), (0, 1)\}$ is a C-basis for V. We now show that $S = \{(1, 0), (i, 0), (0, 1), (0, i)\}$ is an R-basis for $V_2(\mathbf{C})$ regarded as an R-space.

$V_2(\mathbf{C}) = \{(\alpha_1 + \alpha_2 i, \beta_1 + \beta_2 i) \mid \alpha_1, \alpha_2, \beta_1, \beta_2 \in \mathbf{R}\}$. Since

$$(\alpha_1 + \alpha_2 i, \beta_1 + \beta_2 i) = \alpha_1(1, 0) + \alpha_2(i, 0) + \beta_1(0, 1) + \beta_2(0, i)$$

for all $\alpha_1, \alpha_2, \beta_1, \beta_2 \in \mathbf{R}$, we have that S generates $V_2(\mathbf{C})$ over R. Consider

$$\alpha_1(1, 0) + \alpha_2(i, 0) + \beta_1(0, 1) + \beta_2(0, i) = (0, 0)$$

where $\alpha_1, \alpha_2, \beta_1, \beta_2 \in \mathbf{R}$; then

$$(\alpha_1 + \alpha_2 i, \beta_1 + \beta_2 i) = (0, 0)$$

which implies that $\alpha_1 + \alpha_2 i = 0$, $\beta_1 + \beta_2 i = 0$, or $\alpha_1 = \alpha_2 = \beta_1 = \beta_2 = 0$, and S is also linearly independent over R.

6. It can be proved in a similar manner that
$\{(1, 0, 0, \dots, 0), (0, 1, 0, \dots, 0), \dots, (0, 0, \dots, 0, 1), (i, 0, \dots, 0), (0, i, 0, \dots, 0), \dots, (0, 0, \dots, 0, i)\}$ is an R-basis for $V_n(\mathbf{C})$ regarded as an R-space.


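7. Find an R-basis for the solution space of the system of linear equations

$$\begin{aligned} x - 2y + 3z - w &= 0 \\ 2x + y - z + w &= 0 \end{aligned}$$

The reduced echelon matrix of this system is easily shown to be

$$\begin{pmatrix} 1 & 0 & \tfrac{1}{5} & \tfrac{1}{5} \\ 0 & 1 & -\tfrac{7}{5} & \tfrac{3}{5} \end{pmatrix}$$

and the equations become

$$x = -\tfrac{1}{5}z - \tfrac{1}{5}w, \qquad y = \tfrac{7}{5}z - \tfrac{3}{5}w.$$

If we give z, w the arbitrary values $5\lambda$, $5\mu$ respectively, then the general solution of the system is

$$(x, y, z, w) = (-\lambda - \mu,\ 7\lambda - 3\mu,\ 5\lambda,\ 5\mu) = \lambda(-1, 7, 5, 0) + \mu(-1, -3, 0, 5)$$

i.e. the solution space is $\langle (-1, 7, 5, 0), (-1, -3, 0, 5) \rangle$, and these two generators are clearly linearly independent, so they form an R-basis.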
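The procedure of Example 7 (reduce to echelon form, then give each free variable an arbitrary value) can be sketched in code. This is an illustrative implementation under stated assumptions, not the book's text: the helper names `rref` and `null_space_basis` are hypothetical, arithmetic is done with exact fractions, and each free variable is set to 1 in turn, so the vectors obtained are the book's generators divided by 5.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced echelon form and the list of pivot columns."""
    a = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    pivots = []
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if p is None:
            continue  # no pivot here: column c corresponds to a free variable
        a[r], a[p] = a[p], a[r]
        pivot_value = a[r][c]
        a[r] = [x / pivot_value for x in a[r]]
        for i in range(rows):
            if i != r and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    return a, pivots

def null_space_basis(matrix):
    """One basis vector of the solution space per free variable."""
    a, pivots = rref(matrix)
    cols = len(a[0])
    basis = []
    for free in (c for c in range(cols) if c not in pivots):
        v = [Fraction(0)] * cols
        v[free] = Fraction(1)
        # Each pivot variable is determined by the free variable's value.
        for row, p in zip(a, pivots):
            v[p] = -row[free]
        basis.append(v)
    return basis

system = [[1, -2, 3, -1],
          [2, 1, -1, 1]]
basis = null_space_basis(system)
scaled = [[5 * x for x in v] for v in basis]  # clear the denominators
print(scaled)
```

Multiplying through by 5 recovers the two vectors $(-1, 7, 5, 0)$ and $(-1, -3, 0, 5)$ found in the example.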

8. Find an R-basis for the solution space of the differential equation

$$\frac{d^2 y}{dx^2} + y = 0 \quad \text{or} \quad y'' + y = 0 \qquad (*)$$

where $y \in C^{\infty}(\mathbf{R})$. This may be solved using elementary calculus as follows: multiplying throughout by $y'$ gives

$$y'y'' + yy' = 0$$

and integrating

$$y'^2 + y^2 = c^2$$

where c is an arbitrary constant. Thus we have that

$$y' = \sqrt{c^2 - y^2} \quad \text{or} \quad \frac{dy}{\sqrt{c^2 - y^2}} = dx$$

which may be integrated to give

$$\sin^{-1}\frac{y}{c} = x + d$$

or

$$y = c \sin(x + d)$$

where d is also an arbitrary constant. Thus, we have that

$$y = c \cos d \sin x + c \sin d \cos x = \gamma \sin x + \delta \cos x$$

where $\gamma, \delta \in \mathbf{R}$, and the solution space of (*) is $\langle \sin x, \cos x \rangle$.

Now $\{\sin x, \cos x\}$ is linearly independent, for if

$$\alpha \sin x + \beta \cos x = 0$$

with $\alpha, \beta \in \mathbf{R}$, then by putting $x = 0$ and $x = \pi/2$, we obtain that $\beta = 0$ and $\alpha = 0$ respectively. Thus $\{\sin x, \cos x\}$ is an R-basis for the solution space of (*).

Remark Note the similarity in approach to the problems of solutions of linear homogeneous equations and linear differential equations.

Whereas we have given a systematic method for obtaining a K-basis for the solution space of a system of linear homogeneous equations, the corresponding problem is not considered for differential equations.

The first principle method for solving Example 8 is given for illustrative reasons only; for a comprehensive treatment of the application of linear algebra to differential equations and analysis see An Introduction to Linear Analysis by D.L. Kreider, R.G. Kuller, D.R. Ostberg and F.W. Perkins (Addison-Wesley).

Let $\mathscr{B} = \{v_1, v_2, \dots, v_n\}$ be a K-basis for V. If $v \in V$, then $v = \sum_{i=1}^{n} \alpha_i v_i$, where $\alpha_i \in K$ $(i = 1, 2, \dots, n)$ are uniquely determined.

The n-tuple $(\alpha_1, \alpha_2, \dots, \alpha_n) \in V_n(K)$ gives the co-ordinates of v relative to the K-basis $\mathscr{B}$, sometimes denoted by $[v]_{\mathscr{B}}$.

EXAMPLE

In Example 1 above the co-ordinates of the vector $(1, -1, 1)$ relative to the R-basis $\{v_1, v_2, v_3\}$ are $(\tfrac{1}{2}, \tfrac{3}{4}, -\tfrac{1}{4})$, i.e.

$$(1, -1, 1) = \tfrac{1}{2}v_1 + \tfrac{3}{4}v_2 - \tfrac{1}{4}v_3$$
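Finding co-ordinates relative to a basis amounts to solving a system of linear equations whose coefficient columns are the basis vectors. The following sketch, added for illustration, does this by Gauss-Jordan elimination with exact fractions. The function name `coordinates` is an assumption of this example, and the code presumes that the vectors supplied really do form a basis of $V_n(\mathbf{Q})$ (otherwise the pivot search fails).

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve a_1*b_1 + ... + a_n*b_n = v, where basis = [b_1, ..., b_n]."""
    n = len(basis)
    # Augmented matrix: column j holds the basis vector b_j, last column holds v.
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for c in range(n):
        # A non-zero pivot exists because the basis is linearly independent.
        p = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[p] = m[p], m[c]
        pivot_value = m[c][c]
        m[c] = [x / pivot_value for x in m[c]]
        for i in range(n):
            if i != c:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[c])]
    return [row[n] for row in m]

basis = [(1, 0, 1), (2, -1, 1), (4, 1, 1)]  # v_1, v_2, v_3 of Example 1
coords = coordinates(basis, (1, -1, 1))
print(coords)  # [Fraction(1, 2), Fraction(3, 4), Fraction(-1, 4)]
```

The result agrees with the expression $(1, -1, 1) = \tfrac{1}{2}v_1 + \tfrac{3}{4}v_2 - \tfrac{1}{4}v_3$.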

We have seen in Example 1 that in general a K-space may have more than one K-basis; in that example we saw that both K-bases have the same number of elements. The next theorem shows that this is always true. We prove

THEOREM 3.14 Every K-basis of a finitely generated K-space has the same number of elements.

PROOF We first show that if V is a K-space generated by a finite set of vectors $\{v_1, \dots, v_n\}$, and if $\{y_1, y_2, \dots, y_m\}$ is a linearly independent set of vectors in V, then $m \le n$.

Suppose that $m > n$; we show that this leads to a contradiction of the linear independence of $\{y_1, y_2, \dots, y_m\}$. Since $y_j \in V$ $(j = 1, \dots, m)$, then

$$y_j = \sum_{i=1}^{n} \alpha_{ij} v_i$$

where $\alpha_{ij} \in K$. Then, if $\beta_j \in K$ $(j = 1, \dots, m)$,

$$\sum_{j=1}^{m} \beta_j y_j = \sum_{j=1}^{m} \beta_j \Bigl( \sum_{i=1}^{n} \alpha_{ij} v_i \Bigr) = \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} \alpha_{ij} \beta_j \Bigr) v_i.$$

Since $n < m$, by Corollary 3 to Theorem 1.7, there exist $\beta_1, \dots, \beta_m \in K$ (not all zero) such that $\sum_{j=1}^{m} \alpha_{ij} \beta_j = 0$ $(i = 1, \dots, n)$, which in turn implies that $\sum_{j=1}^{m} \beta_j y_j = 0$ and $\{y_1, \dots, y_m\}$ is linearly dependent over K. This is our required contradiction.

Now assume that $\{v_1, \dots, v_n\}$ and $\{y_1, \dots, y_m\}$ are K-bases of V. By regarding $\{y_1, \dots, y_m\}$ as a linearly independent subset in the space generated by $\{v_1, \dots, v_n\}$, the above argument shows that $m \le n$. Similarly, by regarding $\{v_1, \dots, v_n\}$ as a linearly independent subset in the space generated by $\{y_1, \dots, y_m\}$, it follows that $n \le m$. These two results together imply that $m = n$, as required. ∎

This theorem now allows us to make the following important 
definition. 


DEFINITION 3.15 The number of elements in a K-basis for a finitely 
generated K-space V is called the dimension over K of V denoted by 
dim V or (V: K). V is called a finite dimensional vector space. 

EXAMPLES 

By scrutinizing the examples preceding the above theorem we see that

(i) $(V_3(\mathbf{R}) : \mathbf{R}) = 3$

(ii) $(V_n(K) : K) = n$

(iii) $(V_n(\mathbf{C}) : \mathbf{R}) = 2n$

(iv) $(M_n(K) : K) = n^2$

(v) $(M_n(\mathbf{C}) : \mathbf{R}) = 2n^2$.

Our next theorem will show that a K-basis exists and at the same time gives a practical method for obtaining a K-basis for V. We first prove

LEMMA 3.16 If $\{v_1, \dots, v_n\}$ is a linearly dependent set over K in a K-space then at least one of the $v_i$'s is a linear combination of the vectors preceding it.

PROOF If $\{v_1, \dots, v_n\}$ is linearly dependent over K, then there exist $\alpha_i \in K$ $(i = 1, \dots, n)$, not all zero, such that

$$\sum_{i=1}^{n} \alpha_i v_i = 0.$$

Let j be the positive integer such that $1 \le j \le n$ and $\alpha_j \neq 0$ and $\alpha_{j+1} = \alpha_{j+2} = \dots = \alpha_n = 0$; then

$$v_j = -\frac{\alpha_1}{\alpha_j} v_1 - \dots - \frac{\alpha_{j-1}}{\alpha_j} v_{j-1}$$

and the lemma is proved. ∎


THEOREM 3.17 Let V be a finitely generated K-space; then (i) every generating set for V contains a K-basis for V, and (ii) every linearly independent subset of V can be extended to give a K-basis for V, i.e. if $\{y_1, \dots, y_m\}$ is linearly independent over K, there exist vectors $x_1, \dots, x_{n-m}$ in V such that $\{y_1, \dots, y_m, x_1, \dots, x_{n-m}\}$ is a K-basis for V.

PROOF (i) Let $S = \{x_1, \dots, x_m\}$ generate V over K. If S is linearly independent over K then S is a K-basis for V. If not, then by Lemma 3.16, for some $1 \le j \le m$, $x_j$ is a linear combination of the vectors preceding it. If $S' = S \setminus \{x_j\}$, then S' also generates V over K.

If S' is linearly independent over K then this is our required K-basis. If not, repeat the above procedure on the set S' and indeed, if necessary, repeat the procedure until a linearly independent set is obtained. Such a linearly independent set exists since, for example, any set containing one non-zero vector is linearly independent.

(ii) Since V is finitely generated, suppose that $\{v_1, \dots, v_n\}$ generates V over K. Let $\{y_1, \dots, y_m\}$ be a linearly independent set of vectors in V; then by the proof of Theorem 3.14, $m \le n$.

Consider the set

$$S = \{y_1, \dots, y_m, v_1, \dots, v_n\};$$

then S generates V. If S is linearly independent over K then it is the required K-basis. If not, then by Lemma 3.16, one of these vectors is linearly dependent on the vectors preceding it. This must be one of the $v_i$; otherwise, if one of the $y_i$ is a linear combination of the vectors preceding it, we have a contradiction of the linear independence of $\{y_1, \dots, y_m\}$. Let this vector be $v_j$. Let $S' = S \setminus \{v_j\}$; then clearly S' generates V over K. If S' is linearly independent over K, we have the required K-basis for V. If not, repeat the above procedure on the set S'; again one of the remaining $v_i$ must be a linear combination of the vectors preceding it. Since by (i) every generating set for V contains a K-basis for V, this process must eventually terminate and the resulting set is our required K-basis which contains $\{y_1, \dots, y_m\}$. ∎

The following corollary is easily obtained. 

COROLLARY 1 Let V be a finite dimensional K-space with (V : K) = n; then (i) any set of n linearly independent vectors in V is a K-basis for V; (ii) any set of n vectors which generate V is a K-basis for V.

PROOF (i) By the above theorem, every linearly independent set of n vectors can be extended to give a K-basis for V and since (V : K) = n it follows that this set must be a K-basis for V.

(ii) Again, by the above theorem, every set of vectors which generates V contains a K-basis for V and since (V : K) = n this must be our required K-basis for V. ∎

This corollary shows that if (V : K) = n, to verify that a set of vectors $S = \{v_1, \dots, v_n\}$ is a K-basis for V we need only show that either

(i) S is linearly independent over K

or

(ii) S generates V over K.

We illustrate this theorem and its proof by considering the following 
example. 

EXAMPLE 

Let V be the subspace of $V_4(\mathbf{R})$ generated by $\{v_1 = (1, 1, 2, 4),\ v_2 = (2, -1, -5, 2),\ v_3 = (1, -1, -4, 0),\ v_4 = (2, 1, 1, 6)\}$. Find (i) an R-basis for V, (ii) an R-basis for $V_4(\mathbf{R})$ which contains this R-basis for V.

(i) Consider

$$\alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 v_3 + \alpha_4 v_4 = 0$$

or equivalently

$$\begin{aligned} \alpha_1 + 2\alpha_2 + \alpha_3 + 2\alpha_4 &= 0 \\ \alpha_1 - \alpha_2 - \alpha_3 + \alpha_4 &= 0 \\ 2\alpha_1 - 5\alpha_2 - 4\alpha_3 + \alpha_4 &= 0 \\ 4\alpha_1 + 2\alpha_2 + 6\alpha_4 &= 0 \end{aligned}$$

We have

$$\begin{pmatrix} 1 & 2 & 1 & 2 \\ 1 & -1 & -1 & 1 \\ 2 & -5 & -4 & 1 \\ 4 & 2 & 0 & 6 \end{pmatrix} \sim \begin{pmatrix} 1 & 2 & 1 & 2 \\ 0 & -3 & -2 & -1 \\ 0 & -9 & -6 & -3 \\ 0 & -6 & -4 & -2 \end{pmatrix} \sim \begin{pmatrix} 1 & 2 & 1 & 2 \\ 0 & 3 & 2 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & -\tfrac{1}{3} & \tfrac{4}{3} \\ 0 & 1 & \tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

which implies that

$$\alpha_1 = \tfrac{1}{3}\alpha_3 - \tfrac{4}{3}\alpha_4, \qquad \alpha_2 = -\tfrac{2}{3}\alpha_3 - \tfrac{1}{3}\alpha_4.$$

If we put $\alpha_3 = -1$, $\alpha_4 = 0$ and $\alpha_3 = 0$, $\alpha_4 = -1$ respectively we obtain

$$v_3 = -\tfrac{1}{3}v_1 + \tfrac{2}{3}v_2, \qquad v_4 = \tfrac{4}{3}v_1 + \tfrac{1}{3}v_2.$$

Thus $V = \langle v_1, v_2, v_3, v_4 \rangle = \langle v_1, v_2 \rangle$ and $\{v_1, v_2\}$ is an R-basis for V.

(ii) Let $S = \{v_1, v_2, e_1, e_2, e_3, e_4\}$, where $\{e_1, e_2, e_3, e_4\}$ is the standard R-basis for $V_4(\mathbf{R})$.

Since

$$v_1 = (1, 1, 2, 4) = e_1 + e_2 + 2e_3 + 4e_4$$
$$v_2 = (2, -1, -5, 2) = 2e_1 - e_2 - 5e_3 + 2e_4$$

we have

$$2e_3 + 4e_4 = v_1 - e_1 - e_2$$
$$5e_3 - 2e_4 = -v_2 + 2e_1 - e_2$$

or

$$e_3 = \tfrac{1}{12}(v_1 - 2v_2 + 3e_1 - 3e_2)$$
$$e_4 = \tfrac{1}{24}(5v_1 + 2v_2 - 9e_1 - 3e_2)$$

which implies that

$$V_4(\mathbf{R}) = \langle v_1, v_2, e_1, e_2, e_3, e_4 \rangle = \langle v_1, v_2, e_1, e_2 \rangle.$$

(Alternatively, using the above corollary, we need only show that $\{v_1, v_2, e_1, e_2\}$ is linearly independent over R.)
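The proof of Theorem 3.17 is constructive: work through the generating set in order and discard every vector that is a linear combination of the vectors preceding it. The sketch below is an illustrative rendering of that idea, not the book's own procedure. It decides "is a combination of the preceding vectors" by checking whether the rank fails to increase, and the names `rank` and `sift` are assumptions of this example.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a non-empty list of vectors, by elimination over Q."""
    a = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(a[0])):
        p = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        for i in range(r + 1, len(a)):
            factor = a[i][c] / a[r][c]
            a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        r += 1
    return r

def sift(vectors):
    """Keep each vector that is not a combination of the ones kept before it."""
    kept = []
    for v in vectors:
        if rank(kept + [v]) > len(kept):
            kept.append(v)
    return kept

v1, v2, v3, v4 = (1, 1, 2, 4), (2, -1, -5, 2), (1, -1, -4, 0), (2, 1, 1, 6)
e = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

basis_of_v = sift([v1, v2, v3, v4])   # part (i) of the example
extended = sift(basis_of_v + e)       # part (ii) of the example
print(basis_of_v)
print(extended)
```

Applied to the example, the first call keeps $v_1, v_2$ and the second keeps $v_1, v_2, e_1, e_2$, in agreement with the hand calculation.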

We now proceed to obtain a useful theorem which connects the dimensions of subspaces $W_1$ and $W_2$ of a K-space V with the dimensions of the subspaces $W_1 + W_2$ and $W_1 \cap W_2$. We first need to establish

LEMMA 3.18 Every subspace of a finite dimensional K-space V is finite dimensional over K, of dimension $\le (V : K)$.

PROOF Let W be a subspace of a K-space V. Let S be a linearly independent subset of W. Then S may be regarded as a subset of V and since V is a finite dimensional vector space, it follows from the proof of Theorem 3.14 that S contains a finite number of elements and so W is finite dimensional over K and $(W : K) \le (V : K)$. ∎

THEOREM 3.19 Let $W_1$ and $W_2$ be subspaces of a finite dimensional K-space V; then

$$(W_1 : K) + (W_2 : K) = (W_1 \cap W_2 : K) + (W_1 + W_2 : K)$$

PROOF By Lemma 3.8, $W_1 \cap W_2$ is a subspace of V which by Lemma 3.18 is finite dimensional over K. If $(W_1 \cap W_2 : K) = r \le (V : K)$, let $\{v_1, \dots, v_r\}$ be a K-basis for $W_1 \cap W_2$. Since $W_1 \cap W_2$ is a subspace of $W_1$ and $W_2$, by Theorem 3.17 (ii), $\{v_1, \dots, v_r\}$ can be extended to give K-bases for $W_1$ and $W_2$ respectively; suppose that $\{v_1, \dots, v_r, x_1, \dots, x_n\}$ and $\{v_1, \dots, v_r, y_1, \dots, y_m\}$ are K-bases for $W_1$ and $W_2$ respectively; then $(W_1 : K) = r + n$, $(W_2 : K) = r + m$.

We show that $S = \{v_1, \dots, v_r, x_1, \dots, x_n, y_1, \dots, y_m\}$ is a K-basis for $W_1 + W_2$, so that $(W_1 + W_2 : K) = r + n + m$, and this will complete the proof.

We certainly have that S generates $W_1 + W_2$ over K. We need only show that S is linearly independent over K. Consider

$$\sum_{i=1}^{r} \alpha_i v_i + \sum_{j=1}^{n} \beta_j x_j + \sum_{k=1}^{m} \gamma_k y_k = 0.$$

Then we have

$$\sum_{i=1}^{r} \alpha_i v_i + \sum_{j=1}^{n} \beta_j x_j = -\sum_{k=1}^{m} \gamma_k y_k \in W_1 \cap W_2.$$

Since $\{v_1, \dots, v_r\}$ is a K-basis for $W_1 \cap W_2$, there exist $\delta_i \in K$ $(i = 1, \dots, r)$ such that

$$\sum_{k=1}^{m} \gamma_k y_k + \sum_{i=1}^{r} \delta_i v_i = 0.$$

Since now $\{v_1, \dots, v_r, y_1, \dots, y_m\}$ is a K-basis for $W_2$ and is linearly independent over K, we have $\gamma_1 = \dots = \gamma_m = \delta_1 = \dots = \delta_r = 0$.

Similarly, it can be proved that

$$\beta_1 = \beta_2 = \dots = \beta_n = 0$$

and we have

$$\sum_{i=1}^{r} \alpha_i v_i = 0$$

which implies that $\alpha_i = 0$ $(i = 1, \dots, r)$ since $\{v_1, \dots, v_r\}$ is a K-basis for $W_1 \cap W_2$. That is,

$$\alpha_i = 0\ (i = 1, \dots, r), \quad \beta_j = 0\ (j = 1, \dots, n), \quad \gamma_k = 0\ (k = 1, \dots, m)$$

which proves that S is linearly independent over K and is a K-basis for $W_1 + W_2$. ∎

This theorem is illustrated by the following example: 


EXAMPLE 


Let $W_1$ and $W_2$ be the following subspaces of $V_4(\mathbf{R})$:

$$W_1 = \{(\alpha, \beta, \gamma, \delta) \mid \alpha = \beta\}, \qquad W_2 = \{(\alpha, \beta, \gamma, \delta) \mid \alpha + \beta = \gamma,\ \delta = 2\beta\}.$$

Then it is easily verified that $\{(1, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)\}$ is an R-basis for $W_1$ and $\{(1, 0, 1, 0), (0, 1, 1, 2)\}$ is an R-basis for $W_2$.

$\{(1, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1), (1, 0, 1, 0), (0, 1, 1, 2)\}$ generates $W_1 + W_2$ over R, but since

$$(0, 1, 1, 2) = 2(0, 0, 1, 0) + 2(0, 0, 0, 1) + (1, 1, 0, 0) - (1, 0, 1, 0)$$

and $\{(1, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1), (1, 0, 1, 0)\}$ is linearly independent over R, this latter set is an R-basis for $W_1 + W_2$.

If $(\alpha, \beta, \gamma, \delta) \in W_1 \cap W_2$, then

$$\alpha = \beta, \quad 2\beta = \delta, \quad \alpha + \beta = \gamma \qquad \text{or} \qquad \beta = \alpha, \quad \gamma = 2\alpha, \quad \delta = 2\alpha$$

and $W_1 \cap W_2 = \{(\alpha, \alpha, 2\alpha, 2\alpha) \mid \alpha \in \mathbf{R}\}$ and clearly $\{(1, 1, 2, 2)\}$ is an R-basis of $W_1 \cap W_2$. We note that $(W_1 : \mathbf{R}) = 3$, $(W_2 : \mathbf{R}) = 2$, $(W_1 + W_2 : \mathbf{R}) = 4$ and $(W_1 \cap W_2 : \mathbf{R}) = 1$, which shows that

$$(W_1 : \mathbf{R}) + (W_2 : \mathbf{R}) = (W_1 + W_2 : \mathbf{R}) + (W_1 \cap W_2 : \mathbf{R})$$
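The numbers in this example can be checked mechanically. The sketch below is illustrative only: it computes the dimensions of $W_1$, $W_2$ and $W_1 + W_2$ as ranks of generating sets, reads off the dimension of the intersection from Theorem 3.19, and separately confirms that $(1, 1, 2, 2)$ lies in both subspaces (adding it to either basis does not raise the rank). The helper name `rank` is an assumption of this example.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a non-empty list of vectors, by elimination over Q."""
    a = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(a[0])):
        p = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        for i in range(r + 1, len(a)):
            factor = a[i][c] / a[r][c]
            a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        r += 1
    return r

w1 = [(1, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]  # R-basis for W_1
w2 = [(1, 0, 1, 0), (0, 1, 1, 2)]                # R-basis for W_2

dim_w1, dim_w2 = rank(w1), rank(w2)
dim_sum = rank(w1 + w2)                        # W_1 + W_2 is generated by both bases
dim_intersection = dim_w1 + dim_w2 - dim_sum   # rearrangement of Theorem 3.19

common = (1, 1, 2, 2)
in_both = (rank(w1 + [common]) == dim_w1) and (rank(w2 + [common]) == dim_w2)

print(dim_w1, dim_w2, dim_sum, dim_intersection, in_both)
```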
Exercises 3.4

1. Prove that 

0) {0> 3), (5, 2, 1), (0, 1, 6 )} is an R-basis for F 3 (R), 

(ii) if t is a fixed real number and $g_1(x) = 1$, $g_2(x) = x + t$, $g_3(x) = (x + t)^2$, then $\{g_1, g_2, g_3\}$ is an R-basis for $P_2(\mathbf{R})$.

2. Prove that $\{(1, 2, 0), (0, 5, 7), (-1, 1, 3)\}$ is an R-basis for $V_3(\mathbf{R})$ and find the co-ordinates of $(0, 13, 17)$ and $(2, 3, 1)$ relative to this R-basis.

3. (i) Show that $\{(3 + \sqrt{2}, 1 + \sqrt{2}), (7, 1 + 2\sqrt{2})\}$ is linearly dependent over R but linearly independent over Q.

(ii) Show that $\{(1 - i, i), (2, -1 + i)\}$ is linearly dependent over C but is linearly independent over R.

4. Show that (i) $\{e^t, \sin t, t^2\}$, (ii) $\{e^t, \sin t, \cos t\}$ are linearly independent over R.

5. Show that the subset $\{(1, 0, 1, 1), (1, 0, 2, 4)\}$ of $V_4(\mathbf{R})$ is linearly independent over R and extend it to an R-basis for $V_4(\mathbf{R})$.

6. If $\{u, v, w\}$ is linearly independent over C in a C-space V, prove that

(i) $\{u + v, v + w, w + u\}$ is linearly independent over C,

(ii) $\{u + v - 3w, u + 3v - w, v + w\}$ is linearly dependent over C.

7. Prove that $\{(1, 1, 0, -1), (4, -2, 1, 0)\}$ is linearly independent over Q and find a Q-basis for $V_4(\mathbf{Q})$ containing these two vectors.

8 . LetM= j(_^ andA-= {(* °)|>, 

be subspaces of M 2 (R). Find R-bases for Af, N, M n N and M + N. 


ye R 


9. If iS and T are subspaces of F 4 (R) defined by 
S = {(a, 0, 7 , 5) | a 4- 0 4- 7 = 0 } andT= {(a, p, 7 , 5) 1 7 = “ 6 }, 
find R-bases for S 4- T and S n T. 


10. Find an R-basis for the solution space of the equation 

x — 2y + z — 3t = 0 

11. If $V_1$ and $V_2$ are 2-dimensional subspaces in $V_3(\mathbf{R})$, prove that $(V_1 \cap V_2 : \mathbf{R}) > 0$.

12. Let U be the subspace of $V_3(\mathbf{R})$ generated by the two vectors $u_1 = (1, 2, 3)$, $u_2 = (3, -5, 1)$. Show that $(1, 0, 0)$ is not in U but that $(5, -23, -9)$ is in U. Express the latter vector as a linear combination of $u_1$ and $u_2$.

13. Determine whether or not each of the following is an R-basis for $P_n(\mathbf{R})$:

(i) $\{1,\ 1 + t,\ 1 + t + t^2,\ \dots,\ 1 + t + \dots + t^n\}$,

(ii) $\{1 + t,\ t + t^2,\ \dots,\ t^{n-1} + t^n\}$,

(iii) $\{1,\ 1 - t,\ \dots,\ (1 - t)^n\}$.

14. Are the vectors $u = (3, -1, 0, -1)$ and $v = (1, 0, 4, -1)$ in the subspace of $V_4(\mathbf{R})$ generated by $\{(2, -1, 3, 2), (-1, 1, 1, -3), (1, 1, 9, -5)\}$? Hence determine two R-bases of $V_4(\mathbf{R})$, one containing u and one containing v.

15. If S is a linearly independent set of vectors in V and $v \in V$, prove that $v \notin \langle S \rangle$ if and only if $S \cup \{v\}$ is linearly independent.

Show that the set $S = \{(1, 0, -1, 1), (2, -1, 0, 1), (1, 1, 2, 1)\}$ is linearly independent over R and that $x = (1, 3, 3, 2) \in \langle S \rangle$ and $y = (0, 1, 1, -1) \notin \langle S \rangle$. Find an R-basis for $\langle S \rangle$ which contains x and an R-basis for $V_4(\mathbf{R})$ which contains y.


16. Show that M 2 (C) may be regarded as a C-space and an R-space and 
determine its dimension in each case. In each case, determine the 
dimension of the subspace generated by 


(i !).(J K U !) • 


17. If S and T are subspaces of $V_3(\mathbf{C})$ generated by $\{(1, 1, 0), (i, 1 + i, 1), (1 + i, 1 + i, 0)\}$ and $\{(1, 0, 1), (i, -i, 0), (0, i, i)\}$ respectively, find C-bases for $S \cap T$ and $S + T$.

Regarding the above as real spaces, show that if S' and T' are the real subspaces generated by the above sets, then $S' \cap T' = \{0\}$.

18. Let Vbe the subspace ofi^R) generated by the polynomials 
1 — t 1 + r 3 , 2 + t — t 2 + t 3 , 1 + 2t 4- 1 2 — t 3 . Show that 

f(t) = t + t 2 —t 3 GFbut£(f)= l + t — t 2 +1 3 $V. Find an R-basis for V 
which contains/(f) and an R-basis for /^(R) which contains g(t). 
CHAPTER 4


Linear Transformations 
on Vector Spaces 


4.1 Linear Transformations 

Let V and W be K-spaces.

DEFINITION 4.1 A linear transformation (K-homomorphism) T from V into W is a mapping $T : V \to W$ such that

$$T(\alpha v + v') = \alpha T(v) + T(v')$$

for all $\alpha \in K$, $v, v' \in V$, or equivalently

$$T(v + v') = T(v) + T(v')$$

$$T(\alpha v) = \alpha T(v)$$

for all $\alpha \in K$, $v, v' \in V$. If W = V then we say that T is a linear transformation on V.

Essentially, a mapping from V into W is a linear transformation if it respects the two basic operations in a vector space, namely addition and scalar multiplication by elements of K, as illustrated in the following diagram.



The following examples illustrate the breadth of the concept of a linear transformation; the first two show that it is a generalization to arbitrary vector spaces of some well known "transformations" in the plane $V_2(\mathbf{R})$, namely reflection in a line and rotation about a point.

EXAMPLES

1. Let L be a line through the origin and let T be the mapping which reflects each vector in L.

2. Let R be a rotation through an angle $\theta$ about the origin.
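3. Let $T : V_3(\mathbf{R}) \to V_2(\mathbf{R})$ be defined by

$$T(\alpha_1, \alpha_2, \alpha_3) = (\alpha_1 + \alpha_2,\ \alpha_2 - \alpha_3)$$

then T is a linear transformation of $V_3(\mathbf{R})$ into $V_2(\mathbf{R})$ since if $(\alpha_1, \alpha_2, \alpha_3), (\beta_1, \beta_2, \beta_3) \in V_3(\mathbf{R})$, $\alpha \in \mathbf{R}$, then

$$\begin{aligned} T(\alpha(\alpha_1, \alpha_2, \alpha_3) + (\beta_1, \beta_2, \beta_3)) &= T(\alpha\alpha_1 + \beta_1,\ \alpha\alpha_2 + \beta_2,\ \alpha\alpha_3 + \beta_3) \\ &= (\alpha\alpha_1 + \beta_1 + \alpha\alpha_2 + \beta_2,\ \alpha\alpha_2 + \beta_2 - \alpha\alpha_3 - \beta_3) \\ &= \alpha(\alpha_1 + \alpha_2,\ \alpha_2 - \alpha_3) + (\beta_1 + \beta_2,\ \beta_2 - \beta_3) \\ &= \alpha T(\alpha_1, \alpha_2, \alpha_3) + T(\beta_1, \beta_2, \beta_3) \end{aligned}$$

4. Let $T : V_3(\mathbf{R}) \to V_2(\mathbf{R})$ be defined by

$$T(\alpha_1, \alpha_2, \alpha_3) = (\alpha_1 + 1,\ \alpha_2)$$

then T is not a linear transformation since, for example, $T(1, 0, 0) = (2, 0)$ and $T(2(1, 0, 0)) = T(2, 0, 0) = (3, 0)$, and so $T(2(1, 0, 0)) \neq 2T(1, 0, 0) = (4, 0)$.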
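A quick numerical experiment can expose a failure of linearity, as in Example 4. The sketch below is an illustration added here and its function names are assumptions. Note its limitation: a single failed check proves that a mapping is not linear, whereas passing checks on sample vectors only suggests linearity and is no substitute for the algebraic verification given in Example 3.

```python
def add(v, w):
    """Componentwise sum of two tuples."""
    return tuple(a + b for a, b in zip(v, w))

def scale(alpha, v):
    """Scalar multiple of a tuple."""
    return tuple(alpha * a for a in v)

def respects_linearity(T, alpha, v, w):
    """Check T(alpha*v + w) == alpha*T(v) + T(w) for one choice of alpha, v, w."""
    return T(add(scale(alpha, v), w)) == add(scale(alpha, T(v)), T(w))

def T3(v):
    """The mapping of Example 3."""
    a1, a2, a3 = v
    return (a1 + a2, a2 - a3)

def T4(v):
    """The mapping of Example 4."""
    a1, a2, a3 = v
    return (a1 + 1, a2)

zero = (0, 0, 0)
print(respects_linearity(T3, 2, (1, 0, 0), zero))       # True
print(respects_linearity(T3, -3, (1, 2, 3), (4, 5, 6)))  # True
print(respects_linearity(T4, 2, (1, 0, 0), zero))       # False
print(T4(zero))  # (1, 0): a linear transformation must send 0 to 0
```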

5. Let $D : P_n(\mathbf{R}) \to P_n(\mathbf{R})$ be the differentiation mapping defined by mapping a polynomial onto its derivative

$$D(\alpha_0 + \alpha_1 x + \dots + \alpha_n x^n) = \alpha_1 + 2\alpha_2 x + 3\alpha_3 x^2 + \dots + n\alpha_n x^{n-1}$$

then D is a linear transformation on $P_n(\mathbf{R})$ since if $\alpha, \alpha_i, \beta_i \in \mathbf{R}$ $(i = 0, 1, \dots, n)$

$$\begin{aligned} D(\alpha(\alpha_0 + \alpha_1 x + \dots + \alpha_n x^n) &+ (\beta_0 + \beta_1 x + \dots + \beta_n x^n)) \\ &= D((\alpha\alpha_0 + \beta_0) + (\alpha\alpha_1 + \beta_1)x + \dots + (\alpha\alpha_n + \beta_n)x^n) \\ &= (\alpha\alpha_1 + \beta_1) + 2(\alpha\alpha_2 + \beta_2)x + \dots + n(\alpha\alpha_n + \beta_n)x^{n-1} \\ &= \alpha D(\alpha_0 + \alpha_1 x + \dots + \alpha_n x^n) + D(\beta_0 + \beta_1 x + \dots + \beta_n x^n) \end{aligned}$$

6. More generally, let D map a continuously differentiable function on [a, b] onto its derivative, that is, $D : C'[a, b] \to C[a, b]$ is the derivative map. Then D is a linear transformation, since for $\alpha \in \mathbf{R}$, $f_1, f_2 \in C'[a, b]$, the identities

$$D(f_1 + f_2) = D(f_1) + D(f_2)$$

$$D(\alpha f_1) = \alpha D(f_1)$$

are restatements of well known properties of derivatives (see A.S.-T. Lue, Basic Pure Mathematics II, VNR New Mathematics Library 5, p.56).
7. Integration of functions is also a linear transformation; more precisely, if $f \in C[a, b]$ define

$$S(f) = \int_a^x f(t)\,dt$$

where $a \le x \le b$. Then $S : C[a, b] \to C[a, b]$ and

$$\begin{aligned} S(f_1 + f_2) &= \int_a^x (f_1(t) + f_2(t))\,dt \\ &= \int_a^x f_1(t)\,dt + \int_a^x f_2(t)\,dt \\ &= S(f_1) + S(f_2) \end{aligned}$$

and similarly

$$S(\alpha f_1) = \alpha S(f_1)$$

for $f_1, f_2 \in C[a, b]$, $\alpha \in \mathbf{R}$.

8. Let M and N be fixed m x m and n x n matrices respectively. Define $T : M_{m,n}(K) \to M_{m,n}(K)$ by $T(A) = MAN$ for all $A \in M_{m,n}(K)$; then if $A, B \in M_{m,n}(K)$ and $\alpha \in K$ we have

$$\begin{aligned} T(\alpha A + B) &= M(\alpha A + B)N \\ &= \alpha MAN + MBN \\ &= \alpha T(A) + T(B) \end{aligned}$$

i.e. T is a linear transformation on $M_{m,n}(K)$.

9. If V and W are K-spaces, then $I : V \to V$ defined by $I(v) = v$ for all $v \in V$ is a linear transformation on V called the identity transformation and $0 : V \to W$ defined by $0(v) = 0$ for all $v \in V$ is a linear transformation from V into W called the zero transformation.

Remark 1 $T(0) = 0$, since by Theorem 3.2 we have $0 + 0 = 0$, which implies $T(0) + T(0) = T(0)$ and $T(0) + 0 = T(0)$, and the result follows from the uniqueness of the zero element (Theorem 3.4).

Remark 2 If $v_i \in V$, $\alpha_i \in K$ $(i = 1, \dots, n)$, then

$$T\Bigl( \sum_{i=1}^{n} \alpha_i v_i \Bigr) = \sum_{i=1}^{n} \alpha_i T(v_i).$$

This result follows by repeated application of the definition of a linear transformation.

We now prove a theorem which will be useful later when the existence 
of a linear transformation with certain properties is required. 
THEOREM 4.2 Let V and W be K-spaces where $(V : K) = n < \infty$. If $\{v_1, \dots, v_n\}$ is a K-basis for V and $\{w_1, \dots, w_n\}$ are any n vectors in W, then there exists a unique linear transformation $T : V \to W$ such that $T(v_i) = w_i$ $(i = 1, 2, \dots, n)$.

PROOF If $v \in V$, then $v = \sum_{i=1}^{n} \alpha_i v_i$, where $\alpha_i \in K$ $(i = 1, \dots, n)$ are uniquely determined. Define $T : V \to W$ by

$$T(v) = T\Bigl( \sum_{i=1}^{n} \alpha_i v_i \Bigr) = \sum_{i=1}^{n} \alpha_i w_i.$$

If $v, v' \in V$, $\alpha \in K$ and $v = \sum_{i=1}^{n} \alpha_i v_i$, $v' = \sum_{i=1}^{n} \beta_i v_i$, where $\alpha_i, \beta_i \in K$ $(i = 1, \dots, n)$, then

$$\begin{aligned} T(\alpha v + v') &= T\Bigl( \sum_{i=1}^{n} (\alpha\alpha_i + \beta_i) v_i \Bigr) \\ &= \sum_{i=1}^{n} (\alpha\alpha_i + \beta_i) w_i \\ &= \alpha T(v) + T(v') \end{aligned}$$

and so T is a linear transformation of V into W, which clearly has the property that $T(v_i) = w_i$ $(i = 1, \dots, n)$. If $S : V \to W$ is a linear transformation with the same property $S(v_i) = w_i$ $(i = 1, \dots, n)$, then if $v \in V$ and $v = \sum_{i=1}^{n} \alpha_i v_i$, $\alpha_i \in K$ $(i = 1, \dots, n)$, then

$$S(v) = S\Bigl( \sum_{i=1}^{n} \alpha_i v_i \Bigr) = \sum_{i=1}^{n} \alpha_i S(v_i) = \sum_{i=1}^{n} \alpha_i w_i = T(v)$$

and S = T, which proves the uniqueness of T. ∎
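The construction in the proof of Theorem 4.2 can be carried out directly: find the co-ordinates of v relative to the given basis, then form the same combination of the prescribed images. The sketch below is illustrative only. The names `coordinates` and `linear_map_from_basis` are assumptions of this example, the basis is the one from Example 1 of §3.4, and the image vectors in $V_2(\mathbf{R})$ are arbitrary choices made purely for demonstration.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve a_1*b_1 + ... + a_n*b_n = v; assumes basis is a basis of V_n(Q)."""
    n = len(basis)
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for c in range(n):
        p = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[p] = m[p], m[c]
        pivot_value = m[c][c]
        m[c] = [x / pivot_value for x in m[c]]
        for i in range(n):
            if i != c:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[c])]
    return [row[n] for row in m]

def linear_map_from_basis(basis, images):
    """Return T with T(basis[i]) = images[i], extended linearly."""
    def T(v):
        coeffs = coordinates(basis, v)
        width = len(images[0])
        return tuple(sum(c * w[k] for c, w in zip(coeffs, images))
                     for k in range(width))
    return T

basis = [(1, 0, 1), (2, -1, 1), (4, 1, 1)]
images = [(1, 0), (0, 1), (1, 1)]  # arbitrary illustrative choices of w_1, w_2, w_3
T = linear_map_from_basis(basis, images)

print([T(b) == w for b, w in zip(basis, images)])  # each basis vector goes to its image
print(T((4, -1, 3)))  # (4, -1, 3) = 2*v_1 + v_2, so this is 2*w_1 + w_2 = (2, 1)
```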

Exercises 4.1 

1. Determine whether the following mappings $T : V_3(\mathbf{R}) \to V_2(\mathbf{R})$ are linear transformations

(i) $T(\alpha, \beta, \gamma) = (\alpha + \beta - \gamma,\ 2\alpha + \beta)$

(ii) $T(\alpha, \beta, \gamma) = (\alpha + 1,\ \alpha + 2\beta - \gamma)$

(iii) $T(\alpha, \beta, \gamma) = (|\alpha|,\ 0)$

(iv) $T(\alpha, \beta, \gamma) = (\alpha\beta,\ \beta\alpha)$.

2. Determine whether the following mappings $T : M_n(K) \to M_n(K)$ are linear transformations

(i) $T(A) = AS$, where S is a fixed matrix in $M_n(K)$

(ii) $T(A) = AS - SA$, where S is a fixed matrix in $M_n(K)$

(iii) $T(A) = A^t$

(iv) $T(A) = A^2$.

3. Determine whether the following mappings $T : C^1(\mathbf{R}) \to C(\mathbf{R})$ are 
linear transformations 

(i) $T(f(x)) = f'(x)$ 

(ii) $T(f(x)) = \int f(t)\,dt$ 

(iii) $T(f(x)) = f(x)\,f(x)$ 

(iv) $T(f(x)) = x f(x)$ 

(v) $T(f(x)) = f(x + 1)$. 

4. (i) Define $T: C^2(\mathbf{R}) \to C(\mathbf{R})$ by $T(f(x)) = f''(x) - 2f'(x) + 3f(x)$. 

Show that $T$ is a linear transformation; 

(ii) If $a(x), b(x) \in C(\mathbf{R})$, define $T : C^2(\mathbf{R}) \to C(\mathbf{R})$ by 
$T(f(x)) = f''(x) + a(x) f'(x) + b(x) f(x)$, 
and show that $T$ is a linear transformation. 

4.2 The Matrix of a Linear Transformation 

Let $V$ and $W$ be finite dimensional vector spaces over $K$, where 
$(V:K) = n$, $(W:K) = m$. Let $\mathcal{B} = \{v_1, \ldots, v_n\}$ and $\mathcal{W} = \{w_1, \ldots, w_m\}$ 
be K-bases for $V$ and $W$ respectively. For $j = 1, \ldots, n$, $Tv_j \in W$ and thus 

\[ Tv_j = \sum_{i=1}^{m} \alpha_{ij} w_i \]

where $\alpha_{ij} \in K$ are uniquely determined. Let $A = (\alpha_{ij}) \in M_{m,n}(K)$, then 
$A$ is called the matrix of $T$ relative to the K-bases $\mathcal{B}$ and $\mathcal{W}$ and is 
sometimes written ${}_{\mathcal{B}}(T)_{\mathcal{W}}$. Note that the coefficients involved in $Tv_j$ 
give the elements of the $j$th column of $A$. 

Conversely, if $A = (\alpha_{ij}) \in M_{m,n}(K)$, define $T_A : V \to W$ by 

\[ T_A v_j = \sum_{i=1}^{m} \alpha_{ij} w_i \qquad (j = 1, \ldots, n) \]

then by Theorem 4.2, $T_A$ is the unique linear transformation with this 
property. Thus, there is a one-one correspondence between the set of 
linear transformations from $V$ into $W$ and the set of all $m \times n$ matrices 
over $K$. 

If $T$ is a linear transformation on $V$, we denote ${}_{\mathcal{B}}(T)_{\mathcal{B}}$ by $(T)_{\mathcal{B}}$. 

We consider in particular a linear transformation $T$ from $V_n(K)$ into 
$V_m(K)$. If $(x_1, x_2, \ldots, x_n) \in V_n(K)$, then put $T(x_1, x_2, \ldots, x_n) 
= (y_1, y_2, \ldots, y_m) \in V_m(K)$. If $\mathcal{B} = \{e_1, e_2, \ldots, e_n\}$ and 
$\mathcal{W} = \{e'_1, e'_2, \ldots, e'_m\}$ are the standard K-bases for $V_n(K)$ and $V_m(K)$ 
respectively and ${}_{\mathcal{B}}(T)_{\mathcal{W}} = (\alpha_{ij})$, then for $j = 1, \ldots, n$ 

\[ T(e_j) = \sum_{i=1}^{m} \alpha_{ij} e'_i = (\alpha_{1j}, \alpha_{2j}, \ldots, \alpha_{mj}) \]

Thus, we have 

\[ (y_1, y_2, \ldots, y_m) = T(x_1, x_2, \ldots, x_n) = \sum_{j=1}^{n} x_j T(e_j) \]

or in other words 

\[ y_i = \sum_{j=1}^{n} \alpha_{ij} x_j \qquad (i = 1, \ldots, m) \]

This means that if $T : V_n(K) \to V_m(K)$ is a linear transformation and 
$A = (\alpha_{ij}) \in M_{m,n}(K)$ is the matrix of $T$ relative to the standard K-bases 
for $V_n(K)$ and $V_m(K)$ then 

\[ T(x_1, x_2, \ldots, x_n) = \left( \sum_{j=1}^{n} \alpha_{1j} x_j,\; \sum_{j=1}^{n} \alpha_{2j} x_j,\; \ldots,\; \sum_{j=1}^{n} \alpha_{mj} x_j \right) \qquad (1) \]

Conversely, if $T: V_n(K) \to V_m(K)$ is defined by (1), then it is easily 
verified that $T$ is a linear transformation. Hence every linear transformation 
from $V_n(K)$ into $V_m(K)$ must be of this form. Note that if the 
linear transformation $T$ is presented in this form, the $i$th component of 
$T(x_1, \ldots, x_n)$ gives the elements in the $i$th row of the matrix of $T$ 
relative to the standard K-bases for $V_n(K)$ and $V_m(K)$. 

EXAMPLES 

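1. Let $T : V_3(\mathbf{R}) \to V_2(\mathbf{R})$ be defined by $T(\alpha, \beta, \gamma) = (\alpha + \beta - \gamma,\; 2\alpha + \gamma)$ 
for all $(\alpha, \beta, \gamma) \in V_3(\mathbf{R})$. Then, by the above, $T$ is a linear transformation 
and the matrix of $T$ relative to the standard R-bases for $V_3(\mathbf{R})$ and 
$V_2(\mathbf{R})$ is 

\[ \begin{pmatrix} 1 & 1 & -1 \\ 2 & 0 & 1 \end{pmatrix} \]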

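Now, let $\mathcal{B} = \{v_1 = (1, 0, -1),\; v_2 = (1, 1, 1),\; v_3 = (1, 0, 0)\}$ and 
$\mathcal{W} = \{w_1 = (1, 1),\; w_2 = (1, 0)\}$, then it is easily verified that $\mathcal{B}$ and 
$\mathcal{W}$ are linearly independent over $\mathbf{R}$ and are R-bases for $V_3(\mathbf{R})$ and 
$V_2(\mathbf{R})$ respectively. We now determine the matrix of $T$ relative to $\mathcal{B}$ 
and $\mathcal{W}$. We have 

\[ Tv_1 = T(1, 0, -1) = (2, 1) = w_1 + w_2 \]

\[ Tv_2 = T(1, 1, 1) = (1, 3) = 3w_1 - 2w_2 \]

\[ Tv_3 = T(1, 0, 0) = (1, 2) = 2w_1 - w_2 \]

and so 

\[ {}_{\mathcal{B}}(T)_{\mathcal{W}} = \begin{pmatrix} 1 & 3 & 2 \\ 1 & -2 & -1 \end{pmatrix} \]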
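
The computation in this example can be checked mechanically. The following Python sketch assumes exact rational arithmetic; the helper name `coords_2d` is hypothetical and simply solves the $2 \times 2$ system by Cramer's rule.

```python
from fractions import Fraction

def T(a, b, c):
    # The map of this example: T(a, b, c) = (a + b - c, 2a + c).
    return (a + b - c, 2 * a + c)

def coords_2d(w1, w2, y):
    """Coordinates (s, t) with s*w1 + t*w2 == y, by Cramer's rule."""
    det = Fraction(w1[0] * w2[1] - w2[0] * w1[1])
    s = (y[0] * w2[1] - w2[0] * y[1]) / det
    t = (w1[0] * y[1] - y[0] * w1[1]) / det
    return (s, t)

# Matrix relative to the standard bases: column j is T(e_j).
std = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
standard = [[T(*e)[i] for e in std] for i in range(2)]

# Matrix relative to the bases of the example: column j holds the
# coordinates of T(v_j) with respect to w_1, w_2.
B = [(1, 0, -1), (1, 1, 1), (1, 0, 0)]
W = [(1, 1), (1, 0)]
columns = [coords_2d(W[0], W[1], T(*v)) for v in B]
matrix = [[col[i] for col in columns] for i in range(2)]

print(standard)
print(matrix)
```

In both cases the matrix is assembled column by column, which mirrors the remark that the coefficients of $Tv_j$ give the $j$th column.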





2. Let $R : V_2(\mathbf{R}) \to V_2(\mathbf{R})$ be the reflection in the line $L$ with angle $\theta$ 
through the origin. We find the matrix of $R$ relative to the standard 
R-basis for $V_2(\mathbf{R})$. 

To determine this matrix, we must find $R(1, 0)$ and $R(0, 1)$. From 
Figure 4, we see that 

\[ R(1, 0) = (\cos 2\theta, \sin 2\theta) \]

\[ R(0, 1) = (\sin 2\theta, -\cos 2\theta) \]

and the required matrix is 

\[ \begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix} \]

3. Let $R$ be a rotation through an angle $\theta$ about the origin in $V_2(\mathbf{R})$. 

In this case 

\[ R(1, 0) = (\cos\theta, \sin\theta) \]

\[ R(0, 1) = (-\sin\theta, \cos\theta) \]

and so, the matrix of $R$ is 

\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]
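
As a numerical sanity check on the last two examples, the Python sketch below builds both matrices and tests two properties one would expect geometrically: reflecting twice gives the identity, and two successive rotations add their angles. The function names and the tolerance are assumptions made for the illustration.

```python
import math

def reflection(theta):
    """Matrix of the reflection in the line through the origin at angle theta."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[c, s], [s, -c]]

def rotation(theta):
    """Matrix of the rotation through the angle theta about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a floating-point tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

identity = [[1.0, 0.0], [0.0, 1.0]]
print(close(matmul(reflection(0.3), reflection(0.3)), identity))
print(close(matmul(rotation(0.4), rotation(0.5)), rotation(0.9)))
```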

4.3 Change of Basis 

In the first example on page 98, we saw that the matrix of a linear 
transformation is dependent on the choice of basis. We now investigate more 
closely the effect of a change of K-basis on the matrix of a linear 
transformation. 

Before proceeding to the main result we prove the following: 

LEMMA 4.3 Let $\{v_1, v_2, \ldots, v_n\}$ be a K-basis for $V$ and 

\[ v'_j = \sum_{i=1}^{n} \alpha_{ij} v_i \qquad (j = 1, \ldots, n), \]

then $\{v'_1, v'_2, \ldots, v'_n\}$ is linearly 
independent if and only if $A = (\alpha_{ij}) \in M_n(K)$ is an invertible matrix. 

PROOF Consider 

\[ \sum_{j=1}^{n} \beta_j v'_j = 0 \]

where $\beta_j \in K$ $(j = 1, 2, \ldots, n)$. Then 

\[ \sum_{j=1}^{n} \beta_j \left( \sum_{i=1}^{n} \alpha_{ij} v_i \right) = 0 \]

i.e. 

\[ \sum_{i=1}^{n} \left( \sum_{j=1}^{n} \alpha_{ij} \beta_j \right) v_i = 0 \]

But $\{v_1, v_2, \ldots, v_n\}$ is linearly independent over $K$, and so 

\[ \sum_{j=1}^{n} \alpha_{ij} \beta_j = 0 \qquad (i = 1, \ldots, n). \]

This is a system of $n$ linear equations in $n$ 
variables which has a non-trivial solution if and only if $A$ is a singular 
matrix, which implies the lemma. ■ 

COROLLARY A matrix A is invertible if and only if its columns 
(rows) are linearly independent over K. 

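Let $T : V \to W$ be a linear transformation where $(V:K) = n$, 
$(W:K) = m$. Let $\mathcal{B} = \{v_1, v_2, \ldots, v_n\}$ and $\mathcal{W} = \{w_1, w_2, \ldots, w_m\}$ be 
K-bases for $V$ and $W$ respectively and let ${}_{\mathcal{B}}(T)_{\mathcal{W}} = A = (\alpha_{ij}) \in M_{m,n}(K)$ be 
the matrix of $T$ relative to $\mathcal{B}$ and $\mathcal{W}$, i.e. 

\[ Tv_j = \sum_{i=1}^{m} \alpha_{ij} w_i \qquad (j = 1, \ldots, n) \]

Now, let $\mathcal{B}' = \{v'_1, v'_2, \ldots, v'_n\}$ and $\mathcal{W}' = \{w'_1, w'_2, \ldots, w'_m\}$ also be 
K-bases for $V$ and $W$ respectively and let ${}_{\mathcal{B}'}(T)_{\mathcal{W}'} = B = (\beta_{ij}) \in M_{m,n}(K)$ be 
the matrix of $T$ relative to $\mathcal{B}'$ and $\mathcal{W}'$, i.e. 

\[ Tv'_j = \sum_{i=1}^{m} \beta_{ij} w'_i \qquad (j = 1, 2, \ldots, n) \]

Now, $\mathcal{B}$ and $\mathcal{B}'$ are K-bases for $V$ and by Theorem 4.2 there exists a 
unique linear transformation $P$ on $V$ such that $Pv_j = v'_j$ $(j = 1, 2, \ldots, n)$. 
Similarly, there exists a unique linear transformation $Q$ on $W$ such that 
$Qw_j = w'_j$ $(j = 1, 2, \ldots, m)$. Let $C = (\gamma_{ij}) \in M_n(K)$ and 
$D = (\delta_{ij}) \in M_m(K)$ be the matrices of $P$ and $Q$ relative to the K-bases 
$\mathcal{B}$ and $\mathcal{W}$ respectively, i.e. 

\[ v'_j = Pv_j = \sum_{i=1}^{n} \gamma_{ij} v_i \qquad (j = 1, 2, \ldots, n) \]

and 

\[ w'_j = Qw_j = \sum_{i=1}^{m} \delta_{ij} w_i \qquad (j = 1, 2, \ldots, m) \]

Then, for $j = 1, 2, \ldots, n$, we have 

\[ Tv'_j = \sum_{i=1}^{m} \beta_{ij} w'_i = \sum_{i=1}^{m} \beta_{ij} \left( \sum_{k=1}^{m} \delta_{ki} w_k \right) = \sum_{k=1}^{m} \left( \sum_{i=1}^{m} \delta_{ki} \beta_{ij} \right) w_k \]

and alternatively 

\[ Tv'_j = T\left( \sum_{i=1}^{n} \gamma_{ij} v_i \right) = \sum_{i=1}^{n} \gamma_{ij} \left( \sum_{k=1}^{m} \alpha_{ki} w_k \right) = \sum_{k=1}^{m} \left( \sum_{i=1}^{n} \alpha_{ki} \gamma_{ij} \right) w_k \]

Thus, we have two alternative forms for the matrix of $T$ relative to the 
K-bases $\mathcal{B}'$ and $\mathcal{W}$, namely 

\[ {}_{\mathcal{B}'}(T)_{\mathcal{W}} = DB = AC \]

By Lemma 4.3 proved above, we have that $C$ and $D$ are invertible 
matrices, thus 

\[ B = D^{-1}AC \]

Thus, we have proved the following: 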

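THEOREM 4.4 Let $T$ be a linear transformation from a K-space $V$ into a 
K-space $W$. If $\mathcal{B}$ and $\mathcal{B}'$ are K-bases for $V$ and $\mathcal{W}$ and $\mathcal{W}'$ are K-bases for 
$W$, then 

\[ {}_{\mathcal{B}'}(T)_{\mathcal{W}'} = \left( (Q)_{\mathcal{W}} \right)^{-1} \; {}_{\mathcal{B}}(T)_{\mathcal{W}} \; (P)_{\mathcal{B}} \]

where $P$ is the unique linear transformation which maps $\mathcal{B}$ onto $\mathcal{B}'$ and $Q$ 
is the unique linear transformation which maps $\mathcal{W}$ onto $\mathcal{W}'$. 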

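This theorem is illustrated by applying it to the first example considered 
on page 98. Using the notation of that example, then 

\[ {}_{\mathcal{B}}(T)_{\mathcal{W}} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{-1} \begin{pmatrix} 1 & 1 & -1 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 1 \\ 1 & 3 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 2 \\ 1 & -2 & -1 \end{pmatrix} \]

which conforms with the result obtained there. 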
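
The relation $B = D^{-1}AC$ for this example can also be verified by a short computation. The Python sketch below is only a check under stated assumptions: the columns of `C` and `D` are the new basis vectors of the example written in standard coordinates, and the helper names `matmul` and `inverse_2x2` are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(M):
    """Inverse of an invertible 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 1, -1], [2, 0, 1]]                # T relative to the standard bases
C = [[1, 1, 1], [0, 1, 0], [-1, 1, 0]]     # columns: v_1, v_2, v_3
D = [[1, 1], [1, 0]]                       # columns: w_1, w_2

B = matmul(inverse_2x2(D), matmul(A, C))
print(B)
print(matmul(D, B) == matmul(A, C))        # the two forms DB and AC agree
```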

We now consider the particular case when $T$ is a linear transformation 
on $V$ in more detail. In this case, we let the K-bases $\mathcal{B}$ and $\mathcal{W}$ coincide 
and the K-bases $\mathcal{B}'$ and $\mathcal{W}'$ coincide. This means now that $D = C$ and thus 

\[ (T)_{\mathcal{B}'} = C^{-1} (T)_{\mathcal{B}} C \]

We give a special name for matrices connected in this way. 

DEFINITION 4.5 If $A, B \in M_n(K)$, then we say that $B$ is similar to $A$ 
if there exists an invertible matrix $C \in M_n(K)$ such that 

\[ B = C^{-1}AC \]


In fact, what we have proved above is that if $A$ and $B$ represent $T$ 
relative to certain K-bases for $V$, then $A$ and $B$ are similar. The converse 
of this can also be proved. We have 

THEOREM 4.6 Let $A, B \in M_n(K)$, then $A$ and $B$ are similar if and 
only if they represent the same linear transformation on a vector space 
$V$ of dimension $n$ relative to suitably chosen K-bases for $V$. 

PROOF If $A$ and $B$ are similar matrices, then by Definition 4.5 there 
exists an invertible matrix $C \in M_n(K)$ such that $B = C^{-1}AC$. Let $V$ be a 
K-space of dimension $n$ with K-basis $\{v_1, v_2, \ldots, v_n\}$ and let $T$ be the 
linear transformation on $V$ defined by 

\[ Tv_j = \sum_{i=1}^{n} \alpha_{ij} v_i \qquad (j = 1, \ldots, n) \]

where $A = (\alpha_{ij})$. For $j = 1, 2, \ldots, n$, let 

\[ v'_j = \sum_{i=1}^{n} \gamma_{ij} v_i \]

where $C = (\gamma_{ij})$ and $C^{-1} = (\gamma'_{ij})$. Then by Lemma 4.3, since $C$ is 
invertible, $\{v'_1, v'_2, \ldots, v'_n\}$ is also a K-basis for $V$. Furthermore, for 
$j = 1, 2, \ldots, n$ 

\[ Tv'_j = T\left( \sum_{i=1}^{n} \gamma_{ij} v_i \right) = \sum_{i=1}^{n} \gamma_{ij} \left( \sum_{k=1}^{n} \alpha_{ki} v_k \right) = \sum_{k=1}^{n} \left( \sum_{i=1}^{n} \alpha_{ki} \gamma_{ij} \right) v_k \]

\[ = \sum_{k=1}^{n} \left( \sum_{i=1}^{n} \alpha_{ki} \gamma_{ij} \right) \left( \sum_{s=1}^{n} \gamma'_{sk} v'_s \right) = \sum_{s=1}^{n} \left( \sum_{k=1}^{n} \sum_{i=1}^{n} \gamma'_{sk} \alpha_{ki} \gamma_{ij} \right) v'_s \]

Thus, the matrix of $T$ relative to the K-basis $\{v'_1, v'_2, \ldots, v'_n\}$ for $V$ is 
$C^{-1}AC = B$. This completes the proof since the proof of the converse 
has been given before the statement of Theorem 4.6. 
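
A small numerical illustration of Theorem 4.6 may help. In the Python sketch below the matrices `A` and `C` are hypothetical (they are not taken from the text); the new basis is $v'_1 = v_1$, $v'_2 = v_1 + v_2$, so the columns of `C` hold the coordinates of $v'_1, v'_2$.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [3, 4]]        # hypothetical matrix of T relative to {v_1, v_2}
C = [[1, 1], [0, 1]]        # columns: v'_1 = v_1, v'_2 = v_1 + v_2
C_inv = [[1, -1], [0, 1]]   # inverse of C, checked below

assert matmul(C, C_inv) == [[1, 0], [0, 1]]

B = matmul(C_inv, matmul(A, C))   # matrix of T relative to {v'_1, v'_2}
print(B)

# Column j of A*C holds T(v'_j) in the old coordinates; column j of C*B
# holds sum_s B[s][j] * v'_s in the old coordinates. They must agree.
print(matmul(A, C) == matmul(C, B))
```

For instance, the first column of `B` says $Tv'_1 = -2v'_1 + 3v'_2 = v_1 + 3v_2$, which is the first column of `A` as it should be.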

EXAMPLE Let $\mathcal{B} = \{u, v, w\}$ be a K-basis for a 3-dimensional 
vector space $V$ and let 

be the matrix of a linear transformation $T$ on $V$ relative to this K-basis. 
Find the matrix of $T$ relative to the K-basis $\mathcal{B}' = \{u + v,\; u - 2v + w,\; 
v - w\}$. 

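By the above theorem, the matrix 

\[ C = \begin{pmatrix} 1 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -1 \end{pmatrix} \]

and by the methods of §1.6, we can invert this matrix to give 

\[ C^{-1} = \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & -1 \\ 1 & -1 & -3 \end{pmatrix} \]

Thus, by Theorem 4.6, the matrix of $T$ relative to the K-basis $\mathcal{B}'$ is 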



Exercises 4.3 

1. Let $T$ be a linear transformation on $V_3(\mathbf{R})$ defined by 

\[ T(x, y, z) = (x - y,\; x + 2y - z,\; 2x + y + z) \]

Find the matrix of $T$ relative to (i) the standard basis for $V_3(\mathbf{R})$; 

(ii) the R-basis $\{v_1, v_2, v_3\}$ for $V_3(\mathbf{R})$, where $v_1 = (1, 0, 1)$, 
$v_2 = (-2, 1, 1)$, $v_3 = (1, -1, 1)$. 

2. The matrix of a linear transformation $T$ on $V_3(\mathbf{R})$ relative to the 
standard basis is 



1 

0 

1 

Find the matrix of $T$ relative to the R-basis $\{v_1, v_2, v_3\}$ where 
$v_1 = (1, 0, 1)$, $v_2 = (-2, 1, 1)$, $v_3 = (1, -1, 1)$. 

3. Find the matrix of the linear transformations $T$ on $P_n(\mathbf{R})$ defined 
by (a) $T(f(x)) = f'(x)$ (b) $T(f(x)) = f(x + 1)$, relative to the R-basis 

(i) $\{1, x, x^2, \ldots, x^n\}$, (ii) $\{1, x - 1, (x - 1)^2, \ldots, (x - 1)^n\}$, 

(iii) $\{1, 1 + x, 1 + x + x^2, \ldots, 1 + x + \ldots + x^n\}$. 

4. If $S$ is a fixed matrix in $M_2(\mathbf{R})$, find the matrix of each of the 
following linear transformations $T$ on $M_2(\mathbf{R})$ relative to the standard 
R-basis $\{e_{ij} \; (i, j = 1, 2)\}$ 

(i) $TA = SA$, (ii) $TA = AS$, (iii) $TA = SA - AS$. 

5. If $\{u_1, u_2\}$ and $\{v_1, v_2, v_3\}$ are R-bases for $V_2(\mathbf{R})$ and $V_3(\mathbf{R})$ 
respectively and if a linear transformation $T$ from $V_2(\mathbf{R})$ into $V_3(\mathbf{R})$ is 
defined by 

\[ Tu_1 = v_1 + 2v_2 - v_3 \]
\[ Tu_2 = v_1 - v_2 \]

find the matrix of $T$ relative to these bases. Find also the matrix of $T$ 
relative to the R-bases $\{-u_1 + u_2,\; 2u_1 - u_2\}$ and 
$\{v_1,\; v_1 + v_2,\; v_1 + v_2 + v_3\}$ for $V_2(\mathbf{R})$ and $V_3(\mathbf{R})$ respectively. What is 
the relationship between these two matrices? 


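6. If $U$ and $V$ are K-spaces of dimension 3 and $\mathcal{B} = \{u_1, u_2, u_3\}$, 
$\mathcal{W} = \{v_1, v_2, v_3\}$ and 

\[ {}_{\mathcal{B}}(T)_{\mathcal{W}} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & \lambda & \mu \\ 1 & \lambda^2 & \mu^2 \end{pmatrix}, \]

find ${}_{\mathcal{B}'}(T)_{\mathcal{W}}$ where 

\[ \mathcal{B}' = \{u_1 + u_2 + u_3,\; u_2 + (\lambda + 1) u_3,\; u_3\} \]

Hence or otherwise, find the values of $\lambda$ and $\mu$ for which the system of 
linear equations 

\[ x + y + z = 1 \]
\[ x + \lambda y + \mu z = 2 \]
\[ x + \lambda^2 y + \mu^2 z = 4 \]

has (i) a unique solution (ii) more than one solution. 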

7. Find an R-basis for the vector space of all homogeneous real 
quadratic polynomials in three indeterminates $x$, $y$ and $z$. Show that 
the mapping which takes such a polynomial $f(x, y, z)$ into 
f(ox + y + z, py + yz, 0) is a linear transformation and find its matrix 
relative to this R-basis. 

8. Let $U$ be the vector space of all real quadratic polynomials in two 
variables $x$ and $y$ and $V$ the vector space of all real cubic polynomials in 
one variable $x$. Find R-bases for $U$ and $V$. Define $T : U \to V$ by 
$T(f(x, y)) = f(x, 2)$. Show that $T$ is a linear transformation and find 
the matrix of $T$ relative to the R-bases for $U$ and $V$. 

4.4 The Kernel and Image of a Linear Transformation 

Let $V$ and $W$ be K-spaces, and let $T$ be a linear transformation of $V$ into 
$W$. We now introduce certain subspaces of $V$ and $W$ which are important 
in later applications. Put 

\[ \ker T = \{ v \in V \mid Tv = 0 \} \]

and 

\[ \operatorname{im} T = \{ T(v) \mid v \in V \} \]

Then the following lemma is proved. 

LEMMA 4.7 (i) ker T is a subspace of V , 

(ii) im T is a subspace of W. 

PROOF (i) $\ker T$ is non-empty since $0 \in \ker T$. If $v, v' \in \ker T$ and 
$\alpha \in K$, then $T(v) = T(v') = 0$ and $T(\alpha v + v') = \alpha T(v) + T(v') = 0$, 
that is $\alpha v + v' \in \ker T$ and $\ker T$ is a subspace of $V$. 

(ii) $\operatorname{im} T$ is non-empty since $T(0) \in \operatorname{im} T$. If $w, w' \in \operatorname{im} T$ and $\alpha \in K$, 
then $w = T(v)$ and $w' = T(v')$ for some $v, v' \in V$, and thus 

\[ \alpha w + w' = \alpha T(v) + T(v') = T(\alpha v + v') \]

where $\alpha v + v' \in V$, i.e. $\alpha w + w' \in \operatorname{im} T$ and $\operatorname{im} T$ is a subspace of $W$. ■ 

If $V$ and $W$ are finite dimensional vector spaces over $K$, then $\ker T$ 
and $\operatorname{im} T$ are also finite dimensional over $K$ and we can give the 
following 

DEFINITION 4.8 nullity T = (ker T: K), rank T = (im T: K). 

The rank and nullity of a linear transformation are connected as 
follows. 

THEOREM 4.9 If $V$ and $W$ are finite dimensional vector spaces over 
$K$ and $T$ a linear transformation of $V$ into $W$ then 

\[ \operatorname{nullity} T + \operatorname{rank} T = (V:K) \]

PROOF Suppose that nullity $T = r$ and let $\{v_1, v_2, \ldots, v_r\}$ be a 
K-basis for $\ker T$. By Theorem 3.17, this K-basis for $\ker T$ can be 
extended to give a K-basis $\{v_1, \ldots, v_r, v_{r+1}, \ldots, v_n\}$ for $V$, 
if $n = (V:K)$. By definition of $\operatorname{im} T$, it is clear that $\{Tv_1, Tv_2, \ldots, Tv_n\}$ 
generates $\operatorname{im} T$ over $K$. But $Tv_1 = Tv_2 = \ldots = Tv_r = 0$, since 
$v_1, v_2, \ldots, v_r \in \ker T$, which implies that $\{Tv_{r+1}, \ldots, Tv_n\}$ generates 
$\operatorname{im} T$ over $K$. We show that this set is also linearly independent over $K$ 
and hence is a K-basis for $\operatorname{im} T$ and the result will follow, i.e. 
$r + (n - r) = n$. 

Consider 

\[ \alpha_{r+1} T(v_{r+1}) + \ldots + \alpha_n T(v_n) = 0 \]

then 

\[ T(\alpha_{r+1} v_{r+1} + \ldots + \alpha_n v_n) = 0 \]

and $\alpha_{r+1} v_{r+1} + \ldots + \alpha_n v_n \in \ker T$. But $\{v_1, v_2, \ldots, v_r\}$ is a K-basis 
for $\ker T$ and thus 

\[ \alpha_{r+1} v_{r+1} + \ldots + \alpha_n v_n = \beta_1 v_1 + \ldots + \beta_r v_r \]

for $\beta_1, \beta_2, \ldots, \beta_r \in K$, or in other words 

\[ \beta_1 v_1 + \ldots + \beta_r v_r - \alpha_{r+1} v_{r+1} - \ldots - \alpha_n v_n = 0 \]

Since $\{v_1, \ldots, v_n\}$ is a K-basis for $V$ and is linearly independent over $K$, 
we have in particular that $\alpha_{r+1} = \ldots = \alpha_n = 0$ and $\{T(v_{r+1}), \ldots, 
T(v_n)\}$ is linearly independent over $K$. ■ 

EXAMPLES 

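1. Let $T : V_3(\mathbf{R}) \to V_3(\mathbf{R})$ be defined by 

\[ T(\alpha_1, \alpha_2, \alpha_3) = (\alpha_1 + 2\alpha_2 - \alpha_3,\; 2\alpha_1 + \alpha_3,\; \alpha_1 - 2\alpha_2 + 2\alpha_3) \]

Then $T$ is a linear transformation on $V_3(\mathbf{R})$. We now determine R-bases 
for $\ker T$ and $\operatorname{im} T$. We have that 

\[ \operatorname{im} T = \{ T(\alpha_1, \alpha_2, \alpha_3) = \alpha_1 (1, 2, 1) + \alpha_2 (2, 0, -2) + \alpha_3 (-1, 1, 2) \mid \alpha_1, \alpha_2, \alpha_3 \in \mathbf{R} \} \]

Thus $\{(1, 2, 1), (2, 0, -2), (-1, 1, 2)\}$ generates $\operatorname{im} T$ over $\mathbf{R}$. 

Since 

\[ (-1, 1, 2) = \tfrac{1}{2}(1, 2, 1) - \tfrac{3}{4}(2, 0, -2) \]

and $\{(1, 2, 1), (2, 0, -2)\}$ is clearly linearly independent over $\mathbf{R}$ it 
follows that $\{(1, 2, 1), (2, 0, -2)\}$ is an R-basis for $\operatorname{im} T$ and rank $T = 2$. 
By the above theorem, we have that nullity $T = (V_3(\mathbf{R}) : \mathbf{R}) - 2 = 1$. 
Now, $(\alpha_1, \alpha_2, \alpha_3) \in \ker T$ if and only if 

\[ \alpha_1 + 2\alpha_2 - \alpha_3 = 0 \]
\[ 2\alpha_1 + \alpha_3 = 0 \]
\[ \alpha_1 - 2\alpha_2 + 2\alpha_3 = 0 \]

From 

\[ \begin{pmatrix} 1 & 2 & -1 \\ 2 & 0 & 1 \\ 1 & -2 & 2 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & -1 \\ 0 & -4 & 3 \\ 0 & -4 & 3 \end{pmatrix} \]

we deduce that this system reduces to 

\[ \alpha_1 = -\tfrac{1}{2}\alpha_3, \qquad \alpha_2 = \tfrac{3}{4}\alpha_3 \]

or, in other words 

\[ \ker T = \{ \alpha_3 (-\tfrac{1}{2}, \tfrac{3}{4}, 1) \mid \alpha_3 \in \mathbf{R} \} \]

and $\{(-2, 3, 4)\}$ is an R-basis for $\ker T$. 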
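
The figures in this example are easy to confirm by machine. The Python sketch below assumes exact rational arithmetic; `span_dimension` is a hypothetical helper that computes the dimension of the subspace generated by a list of vectors by Gaussian elimination, so that rank $T$ is obtained directly from the generators of $\operatorname{im} T$.

```python
from fractions import Fraction

def span_dimension(vectors):
    """Dimension of the span of `vectors`, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rank = 0
    for c in range(len(rows[0])):
        # Look for a pivot in column c among the rows not yet used.
        p = next((r for r in range(rank, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue
        rows[rank], rows[p] = rows[p], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][c] / rows[rank][c]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def T(a1, a2, a3):
    return (a1 + 2 * a2 - a3, 2 * a1 + a3, a1 - 2 * a2 + 2 * a3)

# The images of the standard basis vectors generate im T.
generators = [T(1, 0, 0), T(0, 1, 0), T(0, 0, 1)]
rank_T = span_dimension(generators)
nullity_T = 3 - rank_T          # Theorem 4.9 with (V:K) = 3

print(rank_T, nullity_T)
print(T(-2, 3, 4))              # the proposed basis vector of ker T
```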

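2. Let $T : C^2(\mathbf{R}) \to C(\mathbf{R})$ be defined by 

\[ T(y) = \frac{d^2 y}{dx^2} + y, \qquad y \in C^2(\mathbf{R}) \]

Then it is easily verified that $T$ is a linear transformation. 
Now 

\[ \ker T = \{ y \in C^2(\mathbf{R}) \mid T(y) = 0 \} = \left\{ y \in C^2(\mathbf{R}) \;\middle|\; \frac{d^2 y}{dx^2} + y = 0 \right\} \]

Thus, $\ker T$ is the solution space of the differential equation 

\[ \frac{d^2 y}{dx^2} + y = 0 \]

which was completely determined in Example 8, p. 81. 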
3. Let $T: V_4(\mathbf{R}) \to V_2(\mathbf{R})$ be defined by 

\[ T(x, y, z, w) = AX \]

where $A$ is the coefficient matrix of the system of linear equations in 
Example 7, p. 80, and $X = (x, y, z, w)$, then 

\[ \ker T = \{ (x, y, z, w) \mid AX = 0 \} \]

Thus, $\ker T$ is the solution space of the system of linear equations 
determined in Example 7, p. 80. 

The application of the ideas introduced in this section and illustrated 
by this last example to the solution of linear equations is developed in 
the final section of this chapter. For a discussion of the corresponding 
problem for linear differential equations see D.L. Kreider, R.G. Kuller, 
D.R. Ostberg and F.W. Perkins (loc. cit.). 

Exercises 4.4 

1. Let $T: V_4(\mathbf{R}) \to V_3(\mathbf{R})$ be the linear transformation defined by 

T(a t , oij, a 3 , (X4) = (oq + a 3 + o^, dj + 2a 2 — a 3 + a 4 , 

3a 2 — 2a 3 ) 

Find R-bases for $\ker T$ and $\operatorname{im} T$. 

2. Find the rank and nullity of the linear transformation from $V_4(\mathbf{R})$ 
into $V_3(\mathbf{R})$ whose matrix relative to standard bases is 

(' 2 - 2 \ 

2 6 3 -3 

\0 2 5 -7/ 

3. Find ker T and im T for all the linear transformations defined in 
Exercises 4.1, No. 1. 

4. Find the rank and nullity of a linear transformation $T$ from $V_4(\mathbf{R})$ 
into $V_3(\mathbf{R})$ defined by 

7’(o!j, ol 2 , a 3 , oq) = (oil o: 3 + 2oq, 2o;j + a 2 + 2cr 3 , o 2 + 4oq) 

Show that (1, 3, k) is in im T if and only if k = 5. 

Find the condition for (l,x, 1 ,y) to be in ker T. 

5. Let F denote the real vector space of polynomials f(x,y) with real 
coefficients of degree not exceeding n in two variables x and y. Show 
that the mappings S and T defined by 

are linear transformations on F. Find the kernel and image of $S$ and $T$. 

Find an R-basis for F and find the matrices of $S$ and $T$ relative to this 
R-basis. 

6. Let $V$ be the vector space of real functions which have derivatives 
of all orders. If $D$ is the derivative, find 

(i) $\ker D$, (ii) $\ker D^n$ $(n > 1)$ and (iii) $\ker (D - I)$. 

7. If S = ^ j 1)» find R-bases for ker T and im T for all the linear 

transformations defined in Exercises 4.2, No. 4. 

8. If $T$ is a linear transformation of a vector space $V$ into a vector 
space $W$, show that the elements of $V$ which are mapped into a given 
subspace $U$ of $W$ form a subspace $X$ of $V$. If the dimensions of $V$, $W$, 
$U$, $X$ are $m$, $n$, $p$, $q$ respectively and if the rank of $T$ is $n$, find a relation 
between $m$, $n$, $p$, $q$. 

9. Let $V$ be an $n$-dimensional K-space and let $S$ and $T$ be linear 
transformations on $V$. Prove that 

\[ \operatorname{nullity}(ST) \le \operatorname{nullity} S + \operatorname{nullity} T \]

If $S^n = 0$, but $S^{n-1} \ne 0$, determine nullity $S$. 

10. If $S : U \to V$ and $T : V \to U$ are linear transformations, prove that 

\[ \operatorname{rank} T - \operatorname{rank} ST \le (U : K) - \operatorname{rank} S \]

11. Find a linear transformation $T$ on some vector space $V$ such that 
$\ker T = \operatorname{im} T$. Can this be done for all vector spaces? 

12. If $V$ is a K-space, prove that 
$\operatorname{im} T \cap \ker T = \{0\}$ if and only if $T(Tv) = 0$ implies $Tv = 0$, where 
$v \in V$. 

13. If $T$ is a linear transformation on a finite dimensional K-space $V$ 
and rank $T^2$ = rank $T$, then $\operatorname{im} T \cap \ker T = \{0\}$. 

14. If $T$ is a linear transformation on $V$ such that $T^2 = T$, prove that 

(i) $\ker T = \operatorname{im}(I - T)$, $\ker (I - T) = \operatorname{im} T$ 

(ii) $\ker T \cap \operatorname{im} T = 0$ 

(iii) every $v \in V$ can be uniquely expressed in the form $v = v_1 + v_2$, 
where $v_1 \in \ker T$, $v_2 \in \operatorname{im} T$. 

4.5 K-isomorphisms and Non-singular Linear Transformations 

We now connect these ideas with other important concepts in algebra, 
namely K-isomorphisms and non-singular or invertible linear 
transformations. 

DEFINITION 4.10 A linear transformation (K-homomorphism) 
$T: V \to W$ is called 

(i) a K-isomorphism if $T$ is a bijective mapping, i.e. $T$ is injective 
($T(v) = T(v')$ implies $v = v'$) and $T$ is surjective (if $w \in W$, there exists 
a $v \in V$ such that $Tv = w$). 

(ii) a non-singular transformation if $\ker T = \{0\}$. 

Let $T: V \to W$ and $S : W \to U$ be K-isomorphisms, where $V$, $W$, $U$ are 
K-spaces, then $ST : V \to U$ defined by $(ST)(v) = S(Tv)$ for all $v \in V$ is 
bijective, and is also a linear transformation since if $\alpha \in K$, $v, v' \in V$ 
then 

\[ (ST)(\alpha v + v') = S(T(\alpha v + v')) = S(\alpha T(v) + T(v')) = \alpha S(T(v)) + S(T(v')) = \alpha\, ST(v) + ST(v') \]

If $T$ is a K-isomorphism of $V$ onto $W$, then since $T$ is a bijective 
mapping $T^{-1} : W \to V$ is also a bijective mapping and is a linear 
transformation from $W$ onto $V$ since if $w_1, w_2 \in W$ and $\alpha \in K$, then 
$w_1 = Tv_1$, $w_2 = Tv_2$ for unique $v_1, v_2 \in V$ and 

\[ T^{-1}(\alpha w_1 + w_2) = T^{-1}(\alpha Tv_1 + Tv_2) = T^{-1}T(\alpha v_1 + v_2) = \alpha v_1 + v_2 = \alpha T^{-1} w_1 + T^{-1} w_2 \]

Thus, $T$ is an invertible linear transformation. Conversely, if $T$ is an 
invertible linear transformation, then $T$ is a K-isomorphism and thus 
these two concepts are equivalent. 


We now show that if the two K-spaces $V$ and $W$ are finite dimensional 
and have the same dimension then the concepts of invertible and 
non-singular linear transformations are equivalent. In fact, we show that 
these are equivalent to various other statements. 

THEOREM 4.11 Let $V$ and $W$ be finite dimensional K-spaces with 
$(V:K) = (W:K) = n$ and $T$ a linear transformation of $V$ into $W$. Then 
the following statements are equivalent 

(i) $T$ is a K-isomorphism 

(ii) $T$ is invertible 

(iii) $T$ is injective 

(iv) $T$ is non-singular 

(v) rank $T = n$ 

(vi) $T$ is surjective 

(vii) if $\{v_1, v_2, \ldots, v_n\}$ is a K-basis for $V$, then 
$\{T(v_1), T(v_2), \ldots, T(v_n)\}$ is a K-basis for $W$. 

PROOF In the proof we show that 

(i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ (v) ⇒ (vi) ⇒ (vii) ⇒ (i), 

which will imply that all seven statements are equivalent. 

(i) ⇒ (ii) has been proved above. 

(ii) ⇒ (iii) is a well-known result for maps in general. 

(iii) ⇒ (iv). Let $v \in \ker T$, then $T(v) = 0 = T(0)$ and as $T$ is injective 
$v = 0$ and $\ker T = \{0\}$ and $T$ is non-singular. 

(iv) ⇒ (v). If $T$ is non-singular, $\ker T = \{0\}$ and nullity $T = 0$. By 
Theorem 4.9 we have rank $T = (V:K) = n$. 

(v) ⇒ (vi). By Lemma 4.7, $\operatorname{im} T$ is a subspace of $W$. If 
rank $T = n = (W:K)$ then $\operatorname{im} T = W$ and $T$ is surjective. 

(vi) ⇒ (vii). If $T$ is surjective, then if $w \in W$, there exists a $v \in V$ such 
that $Tv = w$. If $\{v_1, v_2, \ldots, v_n\}$ is a K-basis for $V$ then 

\[ v = \sum_{i=1}^{n} \alpha_i v_i, \quad \alpha_i \in K \; (i = 1, \ldots, n) \quad \text{and} \quad w = Tv = \sum_{i=1}^{n} \alpha_i T(v_i), \]

i.e. $\{T(v_1), \ldots, T(v_n)\}$ generates $W$. By Corollary 1 to Theorem 3.17 
this must be a K-basis for $W$. 

(vii) ⇒ (i). Let $v, v' \in V$ such that $T(v) = T(v')$. We have 

\[ v = \sum_{i=1}^{n} \alpha_i v_i, \quad v' = \sum_{i=1}^{n} \beta_i v_i, \quad \alpha_i, \beta_i \in K \; (i = 1, \ldots, n) \]

and thus 

\[ \sum_{i=1}^{n} (\alpha_i - \beta_i) T(v_i) = 0. \]

But $\{T(v_1), \ldots, T(v_n)\}$ is a K-basis for $W$ and so $\alpha_i = \beta_i$ $(i = 1, \ldots, n)$, 
i.e. $v = v'$ and $T$ is injective. Let $w \in W$, then $w = \sum_{i=1}^{n} \alpha_i T(v_i)$, or 

\[ T\left( \sum_{i=1}^{n} \alpha_i v_i \right) = w \quad \text{with} \quad \sum_{i=1}^{n} \alpha_i v_i \in V \]

and $T$ is surjective. Thus $T$ is a 
K-isomorphism. ■ 

In particular, we note that if $T$ is a linear transformation on a finite 
dimensional vector space $V$ then the concepts of K-isomorphism, 
invertible and non-singular are equivalent, and furthermore, to prove 
that a linear transformation is invertible we need only verify that it is 
either injective or surjective. 

We now prove two useful results concerning non-singular linear 
transformations. 

THEOREM 4.12 If $V$ is a finite dimensional K-space and $S$, $T$ are 
linear transformations on $V$ where $T$ is non-singular, then 

\[ \operatorname{rank}(TS) = \operatorname{rank} S = \operatorname{rank}(ST) \]

PROOF If $v \in \operatorname{im}(ST)$ then $v = (ST)(v') = S(T(v'))$ for some $v' \in V$, 
i.e. $v \in \operatorname{im} S$ and $\operatorname{im}(ST) \subseteq \operatorname{im} S$. Conversely, if $v \in \operatorname{im} S$, then $v = S(v')$ 
for some $v' \in V$. But $T$ is non-singular and so in particular is surjective 
and so there exists $v'' \in V$ such that $T(v'') = v'$, i.e. $v = ST(v'')$, 
$v \in \operatorname{im}(ST)$ and $\operatorname{im} S = \operatorname{im} ST$ and rank $S$ = rank $ST$. 

Now if $v \in \ker S$ then $S(v) = 0$ and so $TS(v) = 0$, i.e. $v \in \ker TS$ and 
$\ker S \subseteq \ker TS$. If $v \in \ker TS$ then $(TS)(v) = 0$, but $T$ is non-singular 
and so $S(v) = 0$ and $v \in \ker S$, i.e. $\ker S = \ker TS$. 

Hence, we have nullity $(TS)$ = nullity $(S)$ and by Theorem 4.9, it 
follows that rank $TS$ = rank $S$ as required. ■ 
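
Theorem 4.12 can be illustrated numerically. The Python sketch below works with linear transformations of $V_3(\mathbf{Q})$ given by their standard matrices, so that the composite $ST$ has matrix $S \cdot T$; the particular matrices and the helper names are hypothetical. The last line shows that the hypothesis on $T$ cannot simply be dropped.

```python
from fractions import Fraction

def span_dimension(vectors):
    """Dimension of the span of `vectors`, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rank = 0
    for c in range(len(rows[0])):
        p = next((r for r in range(rank, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue
        rows[rank], rows[p] = rows[p], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][c] / rows[rank][c]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def rank_of(M):
    # The columns of M are the images of the standard basis, so they generate the image.
    return span_dimension(list(zip(*M)))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

S = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]     # hypothetical S of rank 2
T = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]     # hypothetical non-singular T
T0 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]    # a singular transformation, for contrast

print(rank_of(S), rank_of(matmul(T, S)), rank_of(matmul(S, T)))
print(rank_of(matmul(S, T0)))             # the rank can drop when T is singular
```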

THEOREM 4.13 If $S$ and $T$ are linear transformations on a finite 
dimensional K-space then $ST$ is non-singular if and only if $S$ and $T$ are 
non-singular. If $ST$ is non-singular, then $(ST)^{-1} = T^{-1}S^{-1}$. 

PROOF If $ST$ is non-singular then $\ker ST = \{0\}$. 

But $\ker T \subseteq \ker ST$, thus $\ker T = \{0\}$ and $T$ is non-singular. By 
Theorem 4.12 it follows that 

\[ \operatorname{rank}(ST) = \operatorname{rank} S = (V:K) \]

and by Theorem 4.11 $S$ is also non-singular. 

If $S$ and $T$ are non-singular then by Theorem 4.12 

\[ \operatorname{rank}(ST) = \operatorname{rank} S = (V:K) \]

and by Theorem 4.11, $ST$ is non-singular. 

If $ST$ is non-singular, then $ST$ has a unique inverse $(ST)^{-1}$, i.e. 
$(ST)(ST)^{-1} = I_V$. But, in addition, we have 

\[ (ST)(T^{-1}S^{-1}) = S(TT^{-1})S^{-1} = I_V \]

and thus $(ST)^{-1} = T^{-1}S^{-1}$. ■ 

The concept of K-isomorphism is of sufficient importance to 
deserve further attention. 

If $T: V \to W$ is a K-isomorphism, then we have seen above that 
$T^{-1} : W \to V$ is also a K-isomorphism. Indeed K-isomorphy is an 
equivalence relation on the set of all K-spaces. Clearly $I_V : V \to V$ is a 
K-isomorphism and if $T : V \to W$ and $S : W \to U$ are K-isomorphisms 
then $ST : V \to U$ is also a K-isomorphism. In this case, we say that $V$ and 
$W$ are K-isomorphic and we denote this by $V \cong W$. K-spaces which are 
K-isomorphic are regarded as being “equal” or “the same”, although 
they may contain different elements and the operations of addition 
and scalar multiplication are distinct. The reader may have noticed for 
example the similarity of the three examples (i) $V_4(\mathbf{R})$, (ii) $P_3(\mathbf{R})$, 
the R-space of real polynomials of degree $\le 3$, (iii) $M_2(\mathbf{R})$, and that in 
practice they are dealt with in the same way. The isomorphism is 
easily established, for 

$T : P_3(\mathbf{R}) \to V_4(\mathbf{R})$ given by 

\[ T(\alpha_0 + \alpha_1 x + \alpha_2 x^2 + \alpha_3 x^3) = (\alpha_0, \alpha_1, \alpha_2, \alpha_3) \]

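is an R-isomorphism and 

$S : M_2(\mathbf{R}) \to V_4(\mathbf{R})$ given by 

\[ S\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = (\alpha, \beta, \gamma, \delta) \]

is also an R-isomorphism. We shall not deal with these examples in 
detail because the following more general theorem can be proved. 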

THEOREM 4.14 Let $V$ be a finite dimensional K-space of dimension 
$n$, then $V \cong V_n(K)$. 

PROOF If $(V:K) = n$, let $\{v_1, v_2, \ldots, v_n\}$ be a K-basis for $V$. 
If $v \in V$ then $v = \alpha_1 v_1 + \ldots + \alpha_n v_n$, where $\alpha_i \in K$ $(i = 1, \ldots, n)$ are 
uniquely determined. Then $T : V \to V_n(K)$ defined by 

\[ T(v) = T(\alpha_1 v_1 + \ldots + \alpha_n v_n) = (\alpha_1, \alpha_2, \ldots, \alpha_n) \]

is a well-defined map which is easily shown to be a linear transformation 
over $K$. $T$ is clearly a surjective map and since $(V : K) = (V_n(K) : K) = n$, 
it follows from Theorem 4.11 that $T$ is a K-isomorphism. ■ 

This theorem is a strong result - if our goal had been to classify all 
finite dimensional vector spaces over a field $K$ then it tells us that every 
finite dimensional vector space over $K$ is essentially a $V_n(K)$ for some 
positive integer $n$. This suggests that rather than deal with an “abstract” 
K-space it is only necessary to consider the “more concrete” $V_n(K)$. 
However, in practice, it turns out that there are sometimes advantages 
in working at the more abstract level - for example, simple and 
elegant proofs may be possible which may not be immediately apparent 
at the more concrete level. The detailed and sometimes cumbersome 
explicit information available when more concrete vector spaces 
comprising vectors, matrices or polynomials are considered may blind 
us to the bare essentials needed to carry through a certain 
proof. But, as is typical in mathematics, developments are 
usually made by exploiting the interplay between the abstract and 
concrete - when no more progress is possible at the abstract level, 
something may be done by considering an isomorphic concrete 
example or vice versa. 

It could be argued that a more elegant and natural introduction to 
linear algebra would be to first consider abstract vector spaces and their 
linear transformations, develop their theory as far as possible and then 
introduce matrices and the applications to linear equations. In this way, 
some of the more cumbersome and tedious proofs could be eliminated 
and the advantages of working at the abstract level would become more 
apparent. However, the philosophy of our approach is that there are more 
benefits to be gained, especially by newcomers to the subject, by first 
working at the more concrete and familiar level and using this as a firm 
foundation from which the new abstract concepts can be introduced. 

In any case, for explicit computations the work in the earlier chapters 
will necessarily eventually have to be covered. 

The same is also true relative to linear transformations and matrices. 

We have seen in §4.2, that every linear transformation can be 
represented by a matrix, but whereas the proof of the associativity of 
multiplication of linear transformations is trivial, the corresponding 
result for multiplication of matrices, although elementary, is cumbersome. 

A further important isomorphism of K-spaces is given as follows: 

Let $\mathcal{L}(V, W)$ denote the set of all linear transformations of a 
K-space $V$ into a K-space $W$. If $S, T \in \mathcal{L}(V, W)$ their sum $S + T$ is 
defined by 

\[ (S + T)(v) = S(v) + T(v) \quad \text{for all } v \in V \]

If $v, v' \in V$, $\alpha \in K$, then 

\[ (S + T)(\alpha v + v') = S(\alpha v + v') + T(\alpha v + v') = \alpha S(v) + S(v') + \alpha T(v) + T(v') = \alpha (S + T)(v) + (S + T)(v') \]

i.e. $S + T \in \mathcal{L}(V, W)$. 

Similarly, if $\alpha \in K$, $S \in \mathcal{L}(V, W)$, define 

\[ (\alpha S)(v) = \alpha S(v) \quad \text{for all } v \in V \]

then $\alpha S \in \mathcal{L}(V, W)$. 

With these definitions of addition and scalar multiplication it can be 
proved that 

LEMMA 4.15 $\mathcal{L}(V, W)$ is a K-space. 

PROOF Exercise. 

If $V$ and $W$ are finite dimensional vector spaces, then the following 
important K-isomorphism can be established. 

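THEOREM 4.16 If $V$ and $W$ are finite dimensional K-spaces, where 
$(V:K) = n$, $(W:K) = m$ then 

\[ \mathcal{L}(V, W) \cong M_{m,n}(K) \]

PROOF If $T \in \mathcal{L}(V, W)$ and $\mathcal{B} = \{v_1, \ldots, v_n\}$, $\mathcal{W} = \{w_1, \ldots, w_m\}$ 
are K-bases for $V$, $W$ respectively, let ${}_{\mathcal{B}}(T)_{\mathcal{W}} = (\alpha_{ij})$ be the matrix of 
$T$ relative to $\mathcal{B}$ and $\mathcal{W}$. Define $\phi : \mathcal{L}(V, W) \to M_{m,n}(K)$ by 

\[ \phi(T) = {}_{\mathcal{B}}(T)_{\mathcal{W}} \]

If $\alpha \in K$, $S, T \in \mathcal{L}(V, W)$ and ${}_{\mathcal{B}}(S)_{\mathcal{W}} = (\beta_{ij})$ then 

\[ \phi(\alpha T + S) = (\alpha\alpha_{ij} + \beta_{ij}) = \alpha\phi(T) + \phi(S) \]

since 

\[ (\alpha T + S) v_j = \sum_{i=1}^{m} (\alpha\alpha_{ij} + \beta_{ij}) w_i \qquad (j = 1, \ldots, n) \]

i.e. $\phi$ is a linear transformation. 

$\phi$ is surjective, since by §4.2 we see that if $A \in M_{m,n}(K)$ then there 
exists a $T_A \in \mathcal{L}(V, W)$ such that $\phi(T_A) = A$. Furthermore, if 
$S, T \in \mathcal{L}(V, W)$ and $\phi(S) = \phi(T)$, i.e. $(\alpha_{ij}) = (\beta_{ij})$, then $\alpha_{ij} = \beta_{ij}$ 
$(i = 1, \ldots, m;\; j = 1, \ldots, n)$ and thus for $j = 1, \ldots, n$ 

\[ Tv_j = \sum_{i=1}^{m} \alpha_{ij} w_i = \sum_{i=1}^{m} \beta_{ij} w_i = Sv_j \]

and so $S = T$, i.e. $\phi$ is injective and we have the required 
K-isomorphism. ■ 

COROLLARY If $V$ and $W$ are finite dimensional K-spaces where 
$(V:K) = n$, $(W:K) = m$ then $(\mathcal{L}(V, W) : K) = mn$. 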

We have seen that in both $M_n(K)$ and $\mathcal{L}(V, V) = \mathcal{L}(V)$ in addition 
to the two operations of addition and scalar multiplication, the 
operation of “multiplication” is also possible. It is of interest to see 
whether this operation is also preserved under the mapping $\phi$; to be 
more precise, if $S, T \in \mathcal{L}(V)$, is $\phi(ST) = \phi(S)\phi(T)$ or 
$(ST)_{\mathcal{B}} = (S)_{\mathcal{B}} (T)_{\mathcal{B}}$? This is easily verified to be the case for if 
$\mathcal{B} = \{v_1, \ldots, v_n\}$ and 

\[ Tv_j = \sum_{i=1}^{n} \alpha_{ij} v_i \qquad (j = 1, \ldots, n) \]

\[ Sv_j = \sum_{i=1}^{n} \beta_{ij} v_i \qquad (j = 1, \ldots, n) \]

then 

\[ (ST)(v_j) = S\left( \sum_{i=1}^{n} \alpha_{ij} v_i \right) = \sum_{i=1}^{n} \alpha_{ij} \left( \sum_{k=1}^{n} \beta_{ki} v_k \right) = \sum_{k=1}^{n} \left( \sum_{i=1}^{n} \beta_{ki} \alpha_{ij} \right) v_k \]

and the $(k, j)$-element of the matrix $(ST)_{\mathcal{B}}$ is $\sum_{i=1}^{n} \beta_{ki} \alpha_{ij}$, which is the 
$(k, j)$-element of the matrix $(S)_{\mathcal{B}} (T)_{\mathcal{B}}$. 

The reader may have wondered when the matrix of a linear 
transformation was first introduced why one did not simply define the 
more natural 

\[ Tv_i = \sum_{j=1}^{n} \alpha_{ij} v_j \qquad (i = 1, \ldots, n) \]

where the ordering of the $i, j$ is preserved, i.e. the matrix is the 
transpose of the one defined in §4.2. If this had been done, then we 
would now have obtained $(ST)_{\mathcal{B}} = (T)_{\mathcal{B}} (S)_{\mathcal{B}}$ or $\phi(ST) = \phi(T)\phi(S)$. 
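
Both statements can be seen in a small computation. The Python sketch below takes two hypothetical linear transformations of $V_2(\mathbf{Q})$, forms their matrices relative to the standard basis in the column convention of §4.2, and compares the matrix of the composite with the matrix products; the function names are assumptions made for the illustration.

```python
def matrix_of(f, n):
    """Matrix of f relative to the standard basis: column j holds f(e_j)."""
    cols = [f(tuple(1 if i == j else 0 for i in range(n))) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

# Two hypothetical linear transformations on V_2(Q).
def T(v):
    x, y = v
    return (x + 2 * y, 3 * y)

def S(v):
    x, y = v
    return (y, x - y)

def ST(v):
    return S(T(v))      # (ST)(v) = S(Tv)

M_T, M_S, M_ST = matrix_of(T, 2), matrix_of(S, 2), matrix_of(ST, 2)

# Column convention of section 4.2: the order of the factors is preserved.
print(M_ST == matmul(M_S, M_T))
# Transposed (row) convention: the order of the factors is reversed.
print(transpose(M_ST) == matmul(transpose(M_T), transpose(M_S)))
```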

(The ideas developed above can be expounded more succinctly as 
follows: A set $\mathcal{A}$ together with operations of addition, multiplication 
and scalar multiplication by elements of a field $K$ is called a linear 
K-algebra if 

(i) $\mathcal{A}$ is a K-space 

(ii) if $a, b \in \mathcal{A}$ then $ab \in \mathcal{A}$ 

(iii) $(ab)c = a(bc)$, $a(b + c) = ab + ac$, $(a + b)c = ac + bc$ for all 
$a, b, c \in \mathcal{A}$ 

(iv) $\alpha(ab) = (\alpha a)b = a(\alpha b)$ for all $a, b \in \mathcal{A}$, $\alpha \in K$. 

If $\mathcal{A}$ and $\mathcal{A}'$ are linear K-algebras then $\mathcal{A}$ and $\mathcal{A}'$ are 
algebra-isomorphic if there exists a bijective mapping $\phi : \mathcal{A} \to \mathcal{A}'$ such that 

(i) $\phi(\alpha a + a') = \alpha\phi(a) + \phi(a')$ 
for all $\alpha \in K$, $a, a' \in \mathcal{A}$ 

(ii) $\phi(ab) = \phi(a)\phi(b)$ 
for all $a, b \in \mathcal{A}$. 

Then $M_n(K)$ and $\mathcal{L}(V)$ are linear K-algebras and if $(V : K) = n$ then 
$M_n(K)$ and $\mathcal{L}(V)$ are not only K-isomorphic but algebra-isomorphic.) 

Exercises 4.5 

1. If $\{v_1, v_2, v_3, v_4\}$ is an $\mathbb{R}$-basis for the vector space $V$, for what values
of $\lambda$ is the linear transformation $T$ defined by

$$Tv_1 = v_1 + \lambda v_4$$

$$Tv_i = 2v_{i-1} + v_i \quad (i = 2, 3, 4)$$
non-singular?

2. If $T$ is the linear transformation on $V_3(\mathbb{R})$ defined by

$$T(\alpha_1, \alpha_2, \alpha_3) = (3\alpha_1 - \alpha_2,\ \alpha_1 - \alpha_2 + \alpha_3,\ -\alpha_1 + 2\alpha_2 - \alpha_3)$$

show that $T$ is non-singular. Give a rule for $T^{-1}$ like the one which
defines $T$.

3. A linear transformation on $\mathbb{C}$ regarded as an $\mathbb{R}$-space is defined by
$T(z) = (1 - i)z$ for all $z \in \mathbb{C}$. Show that $T$ is non-singular.

4. Prove that the differentiation transformation $D$ on $P_n(\mathbb{R})$ is singular.
What does this imply for the kernel of $D$?

5. If $T$ is a linear transformation on $V$ such that $T^2 = 0$, show that $I - T$
is non-singular.

4.6 Applications to Linear Equations and the Rank of Matrices 

Let $A = (\alpha_{ij}) \in M_{m,n}(K)$ and let $c_j$ ($j = 1, \ldots, n$) be the $n$ columns of $A$,
i.e. $c_j = (\alpha_{1j}, \alpha_{2j}, \ldots, \alpha_{mj})$. Then $\{c_1, c_2, \ldots, c_n\}$ generates a subspace
$C_A$ of $V_m(K)$.

DEFINITION 4.17 If $A \in M_{m,n}(K)$ the column rank of $A$ is defined
to be $(C_A : K)$, or in other words, it is the maximum number of linearly
independent column vectors in $A$.




EXAMPLES 


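1. Let

$$A = \begin{pmatrix} 1 & -1 & 2 & 1 \\ 2 & 0 & 1 & 1 \\ 3 & -1 & 3 & 2 \\ 0 & 2 & -3 & -1 \end{pmatrix}$$

If $c_1, c_2, c_3$ and $c_4$ are the columns of $A$ and we consider

$$\alpha_1 c_1 + \alpha_2 c_2 + \alpha_3 c_3 + \alpha_4 c_4 = 0$$

then we have a system of four linear homogeneous equations in the
variables $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ with matrix of coefficients $A$. It is easily shown
that $A$ is row equivalent to

$$\begin{pmatrix} 1 & 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & 1 & -\tfrac{3}{2} & -\tfrac{1}{2} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

and thus

$$\alpha_1 = -\tfrac{1}{2}\alpha_3 - \tfrac{1}{2}\alpha_4$$

$$\alpha_2 = \tfrac{3}{2}\alpha_3 + \tfrac{1}{2}\alpha_4$$

from which, by putting $\alpha_3 = 1$, $\alpha_4 = 0$ and $\alpha_3 = 0$, $\alpha_4 = 1$, we obtain

$$c_3 = \tfrac{1}{2}(c_1 - 3c_2), \quad c_4 = \tfrac{1}{2}(c_1 - c_2)$$

respectively. $\{c_1, c_2\}$ is linearly independent over $\mathbb{R}$ and the column
rank of $A$ is 2.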


2. Find the column rank of the matrix

$$A = \begin{pmatrix} 2a+b & 3a+3b & -a-2b & 2a+b \\ 3b & 3b & -a+b & a+5b \\ 3a & 4a+2b & -3b & 3a \\ b-a & -2a-b & 3b & 3b \end{pmatrix}$$

for all values of $a$ and $b$.

From Theorem 3.13 we see that the columns of $A$ are linearly
independent if and only if $\det A \neq 0$.
Now

$$\det A = \begin{vmatrix} -a-2b & 3a+3b & -a-2b & 2a+b \\ 0 & 3b & -a+b & a+5b \\ -a-2b & 4a+2b & -3b & 3a \\ a+2b & -2a-b & 3b & 3b \end{vmatrix} = (a+2b) \begin{vmatrix} 0 & a+2b & -a+b & 2a+4b \\ 0 & 3b & -a+b & a+5b \\ 0 & 2a+b & 0 & 3a+3b \\ 1 & -2a-b & 3b & 3b \end{vmatrix}$$

$$= -(a+2b)(b-a) \begin{vmatrix} a+2b & 1 & 2a+4b \\ 3b & 1 & a+5b \\ 2a+b & 0 & 3a+3b \end{vmatrix} = (a+2b)(a-b) \begin{vmatrix} a-b & 0 & a-b \\ 3b & 1 & a+5b \\ 2a+b & 0 & 3a+3b \end{vmatrix}$$

$$= (a+2b)(a-b)^2 \begin{vmatrix} 1 & 1 \\ 2a+b & 3a+3b \end{vmatrix} = (a+2b)^2 (a-b)^2$$

Thus $\det A = 0$ if and only if $a = b$ or $a = -2b$, and column rank $A = 4$
for all other values of $a$ and $b$.

If $a = b = 0$, then clearly column rank $A = 0$.
If $a = b \neq 0$, then

$$A = \begin{pmatrix} 3a & 6a & -3a & 3a \\ 3a & 3a & 0 & 6a \\ 3a & 6a & -3a & 3a \\ 0 & -3a & 3a & 3a \end{pmatrix}$$

We note that $c_3 = c_1 - c_2$ and $c_4 = 3c_1 - c_2$ and, if $a \neq 0$, $\{c_1, c_2\}$ is
linearly independent over $\mathbb{R}$, that is, if $a = b \neq 0$, column rank $A = 2$.
If $a = -2b \neq 0$, then

$$A = \begin{pmatrix} -3b & -3b & 0 & -3b \\ 3b & 3b & 3b & 3b \\ -6b & -6b & -3b & -6b \\ 3b & 3b & 3b & 3b \end{pmatrix}$$

and $c_1 = c_2 = c_4$ and $\{c_1, c_3\}$ is linearly independent over $\mathbb{R}$, that is,
column rank $A = 2$.

Before proceeding, we note that the row rank of $A$ may be defined
to be the maximum number of linearly independent row vectors in $A$.
We show below that the row rank of a matrix is equal to its column
rank. In Example 1 above, note that if the rows are denoted by
$r_1, r_2, r_3, r_4$ respectively, then $r_3 = r_1 + r_2$, $r_4 = r_2 - 2r_1$, $\{r_1, r_2\}$ is
linearly independent over $\mathbb{R}$ and so the row rank of $A$ is 2.
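The column and row ranks in Example 1 can be checked by exact row reduction. The sketch below is illustrative only: the function names are hypothetical, and the first column of the matrix is an assumption, inferred from the relations $c_4 = \tfrac{1}{2}(c_1 - c_2)$, $r_3 = r_1 + r_2$ and $r_4 = r_2 - 2r_1$ stated in the text.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form and the list of pivot columns."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # find a row at or below r with a non-zero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def rank(rows):
    """Number of pivots, i.e. the number of non-zero rows of the echelon form."""
    return len(rref(rows)[1])

def transpose(rows):
    return [list(col) for col in zip(*rows)]

# Example 1 (first column inferred from the stated relations).
A = [[1, -1, 2, 1],
     [2, 0, 1, 1],
     [3, -1, 3, 2],
     [0, 2, -3, -1]]

R, pivots = rref(A)
column_rank = rank(A)            # pivots of A count independent columns
row_rank = rank(transpose(A))    # rows of A are the columns of its transpose
print(R[0], R[1], column_rank, row_rank)
```

Exact fractions are used rather than floating point so that the echelon form can be compared directly with the one displayed in the example.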

If $A = (\alpha_{ij}) \in M_{m,n}(K)$, let $T : V_n(K) \to V_m(K)$ be the linear
transformation defined by

$$T(x_1, \ldots, x_n) = (y_1, \ldots, y_m)$$

where $y_i = \sum_{j=1}^{n} \alpha_{ij} x_j$ ($i = 1, \ldots, m$). Then $A$ is the matrix of $T$
relative to the standard $K$-bases for $V_n(K)$ and $V_m(K)$.

In the previous section we defined rank $T$; we can now prove
the reassuring theorem that

THEOREM 4.18 rank $T$ = column rank $A$.

PROOF By definition, rank $T = (\operatorname{im} T : K)$.

If $(y_1, \ldots, y_m) \in \operatorname{im} T$, then

$$y_i = \sum_{j=1}^{n} \alpha_{ij} x_j \quad (i = 1, \ldots, m)$$

or

$$(y_1, \ldots, y_m) = \sum_{j=1}^{n} x_j (\alpha_{1j}, \alpha_{2j}, \ldots, \alpha_{mj})$$

Thus $(y_1, \ldots, y_m)$ is a linear combination of the columns of $A$ and the
columns of $A$ generate $\operatorname{im} T$ over $K$. Hence rank $T$ = column rank of $A$
as required. ■

As a corollary to this theorem we can now prove some important 
statements concerning the column rank of a matrix. 

COROLLARY (i) A matrix $A \in M_n(K)$ is invertible if and only if
column rank $A = n$.

(ii) If $A \in M_n(K)$ is invertible, then the column ranks of $AB$, $BA$
and $B$ are equal, where $B$ is any $n$-rowed matrix.

The proof of (ii) uses Theorem 4.12. 

Furthermore we can prove 

THEOREM 4.19 Row equivalent matrices have the same column rank.

PROOF If $B$ is row equivalent to $A$, then by Theorem 1.17 there
exist elementary matrices $E_1, E_2, \ldots, E_k$ such that

$$B = E_1 E_2 \cdots E_k A$$

But elementary matrices are invertible and so by (ii) above

column rank of $B$ = column rank of $A$ ■

We can now prove that the row and column ranks of a matrix are equal.
We first note

THEOREM 4.20 If a matrix $B$ is obtained from a matrix $A$ by a
single elementary row operation then

row rank $B$ = row rank $A$

PROOF This result follows immediately on consideration of the
three types of elementary row operations separately. ■

COROLLARY Row equivalent matrices have the same row rank.

From this corollary and Theorem 4.19 we now obtain 

THEOREM 4.21 row rank A = column rank A. 

PROOF Let $R$ be the reduced echelon matrix of $A$. If $R$ has $r$ non-zero
rows, then consideration of the form of $R$ implies that
row rank $R$ = column rank $R$ = $r$. Then, by the above corollary and
Theorem 4.19, we have row rank $A$ = column rank $A$ = $r$. ■

From now on we refer to the rank of a matrix only. 

Now consider the system of $m$ linear equations in the $n$ variables
$x_1, x_2, \ldots, x_n$

$$\sum_{j=1}^{n} \alpha_{ij} x_j = \beta_i \quad (i = 1, 2, \ldots, m)$$

Then, by the above, we see that this represents the linear transformation
$T : V_n(K) \to V_m(K)$ defined by

$$T(x_1, \ldots, x_n) = (\beta_1, \beta_2, \ldots, \beta_m)$$

We prove two theorems concerning the solution of linear equations.

We first consider the homogeneous case.

THEOREM 4.22 The solutions of the system of linear equations

$$\sum_{j=1}^{n} \alpha_{ij} x_j = 0 \quad (i = 1, \ldots, m)$$

form a vector space of dimension $n - \operatorname{rank} A$, where $A = (\alpha_{ij}) \in M_{m,n}(K)$.
A non-trivial solution exists if and only if $n > \operatorname{rank} A$. If $n = \operatorname{rank} A$,
the trivial solution is the only solution.

PROOF In the above notation, finding a solution of the above system
is equivalent to finding a $v = (x_1, x_2, \ldots, x_n) \in V_n(K)$ such that $Tv = 0$,
i.e. $v \in V_n(K)$ is a solution if and only if $v \in \ker T$. Thus the solution
space is $\ker T$, which is a subspace of $V_n(K)$. But by Theorem 4.9

$$\operatorname{rank} T + \operatorname{nullity} T = (V_n(K) : K) = n$$
and thus

$$\operatorname{nullity} T = n - \operatorname{rank} T = n - \operatorname{rank} A$$

Hence, the system has a non-trivial solution if and only if
$n - \operatorname{rank} A > 0$.

If $n = \operatorname{rank} A$ then $\ker T = \{0\}$ and the trivial solution is unique. ■
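Theorem 4.22 can be illustrated by computing a basis of the solution space directly from the reduced echelon form. This is a sketch under stated assumptions: the helper names are hypothetical, and the two coefficient matrices are those of the homogeneous systems (i) and (ii) in the examples that follow.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form and the list of pivot columns."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def solution_space_basis(rows):
    """Basis of {x : Ax = 0}: one vector per free (non-pivot) column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -R[r][f]      # solve the pivot variables in terms of x_f
        basis.append(v)
    return basis

A1 = [[1, 2, -1, 1], [2, -1, 1, 1], [1, -1, 1, 2]]   # n = 4, rank 3
A2 = [[1, 2, -1], [2, -1, 1], [3, 1, 1]]             # n = 3, rank 3

basis1 = solution_space_basis(A1)
basis2 = solution_space_basis(A2)
print(len(basis1), len(basis2))   # dimensions n - rank A of the two solution spaces
```

The number of basis vectors returned equals the number of non-pivot columns, which is exactly $n - \operatorname{rank} A$.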

We now consider the non-homogeneous case, i.e.

$$\sum_{j=1}^{n} \alpha_{ij} x_j = \beta_i \quad (i = 1, \ldots, m)$$

A solution of this system exists if and only if $(\beta_1, \ldots, \beta_m) \in \operatorname{im} T$. But
the columns of $A$ generate $\operatorname{im} T$, thus a solution exists if and only if
$(\beta_1, \ldots, \beta_m)$ is a linear combination of the columns of $A$.

Let $(A \mid b)$ be the augmented matrix of $A$, i.e. the matrix obtained by
adjoining $b = (\beta_1, \ldots, \beta_m)^t$ to the matrix $A$. Then we can prove

THEOREM 4.23 A solution of the system of linear equations

$$\sum_{j=1}^{n} \alpha_{ij} x_j = \beta_i \quad (i = 1, \ldots, m)$$

exists if and only if $\operatorname{rank}(A \mid b) = \operatorname{rank} A$.

PROOF A solution of the system exists if and only if $(\beta_1, \beta_2, \ldots, \beta_m)$
is a linear combination of the columns of $A$.

Thus, if a solution exists, $\operatorname{rank}(A \mid b) = \operatorname{rank} A$. Conversely, if
$\operatorname{rank}(A \mid b) = \operatorname{rank} A$, then either $b = 0$ or $b$ is a linear combination
of the columns of $A$ and so a solution exists. ■

The results of Theorems 4.22 and 4.23 should be compared with the
criteria given for the solution of linear equations in Chapter 1.

EXAMPLES 


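1. Do the following systems of linear equations have non-trivial
solutions?

(i)
$$\begin{aligned} x_1 + 2x_2 - x_3 + x_4 &= 0 \\ 2x_1 - x_2 + x_3 + x_4 &= 0 \\ x_1 - x_2 + x_3 + 2x_4 &= 0 \end{aligned}$$

(ii)
$$\begin{aligned} x_1 + 2x_2 - x_3 &= 0 \\ 2x_1 - x_2 + x_3 &= 0 \\ 3x_1 + x_2 + x_3 &= 0 \end{aligned}$$

(i)
$$A = \begin{pmatrix} 1 & 2 & -1 & 1 \\ 2 & -1 & 1 & 1 \\ 1 & -1 & 1 & 2 \end{pmatrix}$$

Clearly, rank $A \leq 3 < 4$ and by Theorem 4.22 the system has a
non-trivial solution.

(ii)
$$A = \begin{pmatrix} 1 & 2 & -1 \\ 2 & -1 & 1 \\ 3 & 1 & 1 \end{pmatrix}$$

which is easily shown to be row equivalent to
$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, i.e.
rank $A = 3$ and again by Theorem 4.22 the trivial solution is unique.

2. Solve, if a solution exists, the system of linear equations

$$\begin{aligned} x_1 + x_2 - 2x_3 + x_4 + 3x_5 &= 1 \\ 2x_1 - x_2 + 2x_3 + 2x_4 + 6x_5 &= 2 \\ 3x_1 + 2x_2 - 4x_3 - 3x_4 - 9x_5 &= 3 \end{aligned}$$

The augmented matrix is

$$(A \mid b) = \left(\begin{array}{ccccc|c} 1 & 1 & -2 & 1 & 3 & 1 \\ 2 & -1 & 2 & 2 & 6 & 2 \\ 3 & 2 & -4 & -3 & -9 & 3 \end{array}\right)$$

which is row equivalent to

$$\left(\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & -2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 3 & 0 \end{array}\right)$$

Since we clearly have in this case that $\operatorname{rank}(A \mid b) = \operatorname{rank} A$, a solution
exists (by Theorem 4.23). A solution is now found by the methods
given in Chapter 1, i.e. from the above, we see that the system reduces to

$$\begin{aligned} x_1 &= 1 \\ x_2 - 2x_3 &= 0 \\ x_4 + 3x_5 &= 0 \end{aligned}$$

and a general solution is of the form

$$(1, 2\lambda, \lambda, -3\mu, \mu) = (1, 0, 0, 0, 0) + \lambda(0, 2, 1, 0, 0) + \mu(0, 0, 0, -3, 1)$$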

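3. For the system of linear equations

$$\begin{aligned} x_1 + x_2 + 2x_3 + x_4 &= 5 \\ 2x_1 + 3x_2 - x_3 - 2x_4 &= 2 \\ 4x_1 + 5x_2 + 3x_3 &= 7 \end{aligned}$$

the augmented matrix is

$$(A \mid b) = \left(\begin{array}{cccc|c} 1 & 1 & 2 & 1 & 5 \\ 2 & 3 & -1 & -2 & 2 \\ 4 & 5 & 3 & 0 & 7 \end{array}\right)$$

which is row equivalent to

$$\left(\begin{array}{cccc|c} 1 & 0 & 7 & 5 & 13 \\ 0 & 1 & -5 & -4 & -8 \\ 0 & 0 & 0 & 0 & -5 \end{array}\right)$$

which implies that $\operatorname{rank}(A \mid b) > \operatorname{rank} A$ and so the system does not
have a solution.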
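The criterion of Theorem 4.23 is easy to test mechanically. The following sketch, with hypothetical helper names `rank` and `is_consistent`, compares the rank of the coefficient matrix with that of the augmented matrix for the systems of Examples 2 and 3, using exact rational arithmetic.

```python
from fractions import Fraction

def rank(rows):
    """Rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def is_consistent(A, b):
    """A solution of Ax = b exists iff rank(A | b) == rank(A)."""
    augmented = [row + [beta] for row, beta in zip(A, b)]
    return rank(augmented) == rank(A)

# Example 2: consistent system.
A2 = [[1, 1, -2, 1, 3], [2, -1, 2, 2, 6], [3, 2, -4, -3, -9]]
b2 = [1, 2, 3]

# Example 3: inconsistent system.
A3 = [[1, 1, 2, 1], [2, 3, -1, -2], [4, 5, 3, 0]]
b3 = [5, 2, 7]

print(rank(A2), is_consistent(A2, b2))
print(rank(A3), is_consistent(A3, b3))
```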


Exercises 4.6

1. Show that the rank of the matrix 


/1 — \/6 \/3 \ J 2 

2 s/6 -s/2 

\ 1 -s/3 s/2 — s/6 


2 , 2 . 
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2. Find the rank of the following matrices for all values of a 


(i) 

1 

3 5 

6 + a \ 

(ii) 

I 1 1-2 a 

1 

1-2 a 


2 

3 4 -a 

2 


1+2 a —1 —a 

-1 +2a 

1 —a 


1 

1 -a -2 

-5 


2—a 3 a 

2 a 

2-a 


li 

6 12 

19 / 


\ 5 a 1 — a 

1 4-3 a 

0 

Show 

that the rank 

of the matrix 






a b 

b 







b a 

—a 

-b 





a 

+ b a + b 

2a 

—2a 





\- 

~2a 2a 

a + b 

a + b! 





is 4 unless a 4- b = 0 or b = 3a. Find the rank in each of these cases. 

4. Find all values of t for which the matrix 

/(I 4- t)t t- 1 -t 

0 2-1 

\ -2 1 4 — 2 1 t — 

is of rank less than 3 and determine the corresponding rank. In each 
case express one of the columns as a linear combination of the others. 


5. Find the rank of the linear transformation defined in Exercises 4.2,
No. 7 for all values of $\alpha$, $\beta$ and $\gamma$.

6. Let $S \in M_n(K)$ be fixed and $T$ be the linear transformation on
$M_n(K)$ defined by $T(A) = AS$. If $S$ is an invertible matrix, show that
rank $T = n^2$. In general, prove that rank $T = n \operatorname{rank} S$.




CHAPTER 5 


Inner Product Spaces 


5.1 Introduction and Three-Dimensional Geometry 

In Chapter III, vector spaces over any field were defined as generalizations
of 2- and 3-dimensional real spaces and many of the basic concepts in these
cases were extended to vector spaces in general, e.g. $K$-basis, linear
transformation, etc. But there are, in addition, further concepts which prove to
be useful in these cases, namely the length of a line, the angle between two
lines, perpendicular lines, etc. This chapter will consider the generalization
of these ideas to vector spaces over the real and complex fields. Before
considering this generalization, we shall review these concepts in 3-space
$V_3(\mathbb{R})$.

Let $v = (\alpha, \beta, \gamma) \in V_3(\mathbb{R})$; then, referring to Figure 1, the length of $v$,
which is denoted by $\|v\|$, is

$$\|v\| = \sqrt{OR^2 + PR^2} = \sqrt{OQ^2 + QR^2 + PR^2} = \sqrt{\alpha^2 + \beta^2 + \gamma^2}$$

where $\sqrt{\ }$ means the positive square root.
Note that if $\lambda \in \mathbb{R}$ then

$$\|\lambda v\| = |\lambda| \, \|v\|$$

since $\|\lambda v\| = \sqrt{\lambda^2(\alpha^2 + \beta^2 + \gamma^2)} = |\lambda| \, \|v\|$.

$v \in V_3(\mathbb{R})$ is called a unit vector if $\|v\| = 1$. For example,

(1,0,0), (0, 1,0), (0,0, 1), (Jj, 0 ) , (^, jf. are unit 

vectors in $V_3(\mathbb{R})$. If $v \in V_3(\mathbb{R})$, $v \neq 0$, then $\frac{1}{\|v\|} v$ is a unit vector.

[Figure 1]

Let $\theta$, $\psi$, $\varphi$ be the angles which $OP$ makes with the $x$-, $y$- and $z$-axes
respectively, then

$$\cos\theta = \frac{OQ}{OP} = \frac{\alpha}{\|v\|}, \quad \cos\psi = \frac{\beta}{\|v\|}, \quad \cos\varphi = \frac{\gamma}{\|v\|}$$

$\cos\theta$, $\cos\psi$ and $\cos\varphi$ are called the direction cosines of $v$.

Now, let $v = (\alpha, \beta, \gamma)$ and $w = (\alpha', \beta', \gamma')$ be two vectors in $V_3(\mathbb{R})$,
represented by the points $P$ and $Q$ respectively, see Fig. 2.

[Figure 2]

The distance between $P$ and $Q$, denoted by $d(P, Q)$, or the length of $PQ$,
is equal to the length of $OP'$, where $P'$ is the point
$(\alpha - \alpha', \beta - \beta', \gamma - \gamma')$. Thus

$$d(P, Q) = \|(\alpha - \alpha', \beta - \beta', \gamma - \gamma')\| = \sqrt{(\alpha - \alpha')^2 + (\beta - \beta')^2 + (\gamma - \gamma')^2} = \|v - w\|$$

The angle between $v$ and $w$ is defined to be the angle $POQ = \theta$ such
that $0 \leq \theta \leq \pi$. Now, by the cosine rule

$$PQ^2 = OP^2 + OQ^2 - 2 \cdot OP \cdot OQ \cos\theta$$

that is,

$$\cos\theta = \frac{\|v - w\|^2 - \|v\|^2 - \|w\|^2}{-2\|v\|\,\|w\|}$$

$$= \frac{(\alpha - \alpha')^2 + (\beta - \beta')^2 + (\gamma - \gamma')^2 - (\alpha^2 + \beta^2 + \gamma^2 + \alpha'^2 + \beta'^2 + \gamma'^2)}{-2\|v\|\,\|w\|}$$

$$= \frac{\alpha\alpha' + \beta\beta' + \gamma\gamma'}{\|v\|\,\|w\|}$$

If $v = (\alpha, \beta, \gamma)$, $w = (\alpha', \beta', \gamma')$, then the inner product (or dot product)
of $v$ and $w$, denoted by $(v, w)$ (or $v \cdot w$), is defined by

$$(v, w) = \alpha\alpha' + \beta\beta' + \gamma\gamma'$$

In that case, from the above, we have

$$(v, w) = \|v\|\,\|w\| \cos\theta$$

which is sometimes given as the definition of inner product. Note also
that $\|v\| = \sqrt{(v, v)}$.
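The formulas for length, distance and angle in 3-space translate directly into code. The sketch below is illustrative: the function names and the two sample vectors are assumptions chosen for the demonstration, not taken from the text.

```python
import math

def inner(v, w):
    """Dot product (v, w) = aa' + bb' + cc' of two vectors in 3-space."""
    return sum(a * b for a, b in zip(v, w))

def length(v):
    """||v|| = sqrt((v, v)), taking the positive square root."""
    return math.sqrt(inner(v, v))

def distance(v, w):
    """d(P, Q) = ||v - w||."""
    return length([a - b for a, b in zip(v, w)])

def angle(v, w):
    """Angle between non-zero v and w, in the range 0 <= theta <= pi."""
    return math.acos(inner(v, w) / (length(v) * length(w)))

v = (1, 2, 2)
w = (2, 1, -2)

print(length(v), length(w))      # both vectors have length 3
print(inner(v, w))               # zero, so v and w are perpendicular
print(angle(v, w), math.pi / 2)  # the angle is a right angle
print(distance(v, w))            # sqrt(1 + 1 + 16)
```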

Two vectors v and w are perpendicular or orthogonal if cos 0 = 0 
or (v, w) = 0. For example (1,0, 0), (0, 1,0), (0, 0, 1) are mutually 


to each other. 

In physical applications the vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1)
are denoted by $i$, $j$ and $k$ respectively and, as we saw in Chapter III,

$\{i, j, k\}$ is the standard $\mathbb{R}$-basis for $V_3(\mathbb{R})$. Thus, $V_3(\mathbb{R})$ has an $\mathbb{R}$-basis
consisting of vectors which are mutually perpendicular or orthogonal.
This is the foundation for the Cartesian coordinate system which is of
fundamental importance in the development of geometry. In the next
section, having generalized the concept of orthogonal vectors to
arbitrary vector spaces, we shall show that every real and complex
finite dimensional vector space has a basis consisting of orthogonal
vectors.


perpendicular. Also 


(J2 


■j and 0 > — ^f) are perpendicular 



























The following theorem can be proved concerning the inner product.

THEOREM 5.1 If $u, v, w \in V_3(\mathbb{R})$ and $\lambda \in \mathbb{R}$ then

(i) $(u + v, w) = (u, w) + (v, w)$

(ii) $(\lambda v, w) = \lambda(v, w)$

(iii) $(v, w) = (w, v)$

(iv) $(v, v) > 0$ if $v \neq 0$.

PROOF The proof is elementary, for example
if $v = (\alpha, \beta, \gamma)$, $w = (\alpha', \beta', \gamma')$ then

$$(\lambda v, w) = \lambda\alpha\alpha' + \lambda\beta\beta' + \lambda\gamma\gamma' = \lambda(\alpha\alpha' + \beta\beta' + \gamma\gamma') = \lambda(v, w)$$

proving (ii). $\alpha^2 + \beta^2 + \gamma^2 > 0$ with $\alpha, \beta, \gamma \in \mathbb{R}$ if and only if at least
one of $\alpha$, $\beta$ or $\gamma$ is non-zero, that is $(\alpha, \beta, \gamma) \neq (0, 0, 0)$, which proves (iv). ■

When vector spaces were first defined in Chapter 3, the basic properties
verified in the special case $V_n(K)$ were taken as motivation for the
definition of abstract vector spaces. For the new concepts which are to
be introduced now, it is the properties of the inner product enunciated 
in Theorem 5.1 which are crucial to make things work. Thus, we again 
reverse the process and take the statements in Theorem 5.1 as the basis 
of definition of inner product spaces given in the next section. 

Attention is restricted to the case where the field K is a real or complex 
field (or their subfields). Statement (iv) will only be meaningful if (v, v) 
takes values in an ordered field such as the real field. By modifying the 
requirement (iii), this can also be assured when K is the complex field. 

It would be possible to develop the theory of vector spaces over an 
arbitrary field with a symmetric inner product, i.e. an inner product 
satisfying (i), (ii) and (iii) but it will be seen that (iv) is absolutely 
essential if the crucial concepts of the length of a vector, distance 
between points, the angle between two vectors and perpendicular or 
orthogonal vectors are to be generalized. 

Exercises 5.1 

1. Which of the following pairs of vectors are perpendicular?

(i) (2, −1, 1) and (1, 2, 1)

(ii) (2, 1, −3) and (1, 1, 1)

(iii) (7, 5, 3) and (1, −2, 1).

2. Find a vector perpendicular to (2, −1, 2) and (1, −1, 2).

3. Find the lengths of the following vectors

(i) (1, 2, 1), (ii) (3, −2, 5), (iii) (1, 0, −1).

4. Find the angle between the following pairs of vectors

(i) (3, −2, 1) and (1, −1, 1)

(ii) (2, 1, −1) and (1, 0, 2).

5.2 Euclidean and Unitary Spaces 

Throughout this section $K$ will stand for either the field of real numbers
$\mathbb{R}$ or the field of complex numbers $\mathbb{C}$ and $V$ for a $K$-space.

DEFINITION 5.2 An inner product on $V$ is a function which assigns
to each ordered pair of vectors $u, v \in V$ a scalar $(u, v) \in K$ with the
following properties

(i) $(u + w, v) = (u, v) + (w, v)$ for all $u, v, w \in V$

(ii) $(\alpha u, v) = \alpha(u, v)$ for all $u, v \in V$, $\alpha \in K$

(iii) $(v, u) = \overline{(u, v)}$ for all $u, v \in V$

(iv) $(v, v) > 0$ if $v \neq 0$,

where the bar denotes taking the complex conjugate. A vector space $V$
with inner product is called an inner product space. In particular, a real
inner product space is called a Euclidean space and a complex inner
product space is called a unitary space.

Before looking at examples, we note that (i), (ii) and (iii) imply that

$$(\alpha u + \beta w, v) = \alpha(u, v) + \beta(w, v)$$

and

$$(u, \alpha v + \beta w) = \bar{\alpha}(u, v) + \bar{\beta}(u, w)$$
for all $\alpha, \beta \in K$, $u, v, w \in V$, since

$$(\alpha u + \beta w, v) = (\alpha u, v) + (\beta w, v) \quad \text{by (i)}$$

$$= \alpha(u, v) + \beta(w, v) \quad \text{by (ii)}$$

$$(u, \alpha v + \beta w) = \overline{(\alpha v + \beta w, u)} \quad \text{by (iii)}$$

$$= \overline{\alpha(v, u) + \beta(w, u)} \quad \text{by the above}$$

$$= \bar{\alpha}(u, v) + \bar{\beta}(u, w) \quad \text{by (iii).}$$

Note also that (iii) ensures that $(v, v) \in \mathbb{R}$ and so (iv) is a meaningful
statement.

EXAMPLES

1. If $u = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, $v = (\beta_1, \beta_2, \ldots, \beta_n) \in V_n(\mathbb{R})$, define

$$(u, v) = \alpha_1\beta_1 + \alpha_2\beta_2 + \ldots + \alpha_n\beta_n$$

then it is easily shown that this is an inner product on $V_n(\mathbb{R})$ and thus
$V_n(\mathbb{R})$ is a Euclidean space. This is a natural generalization of the inner
product on $V_3(\mathbb{R})$ considered in Section 1, and will be called the
standard inner product on $V_n(\mathbb{R})$.

2. If $u = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, $v = (\beta_1, \beta_2, \ldots, \beta_n) \in V_n(\mathbb{C})$, define

$$(u, v) = \alpha_1\bar{\beta}_1 + \alpha_2\bar{\beta}_2 + \ldots + \alpha_n\bar{\beta}_n$$

then again it is easily shown that $V_n(\mathbb{C})$ is an inner product space and
thus a unitary space. This will be called the standard inner product on
$V_n(\mathbb{C})$.

3. If $f, g \in C[a, b]$, define

$$(f, g) = \int_a^b f(t)\,g(t)\,dt$$

then (i)-(iv) in Definition 5.2 are familiar properties of integration.
This will be called the standard inner product on $C[a, b]$.

4. If $u = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, $v = (\beta_1, \beta_2, \ldots, \beta_n) \in V_n(\mathbb{R})$, define

$$(u, v) = \alpha_1\beta_1 + 2\alpha_2\beta_2 + \ldots + n\alpha_n\beta_n$$

then it is again easily verified that this also defines an inner product on
$V_n(\mathbb{R})$.
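The role of the complex conjugate in the standard inner product on $V_n(\mathbb{C})$ can be seen in a short computation. This is a minimal sketch using Python's built-in complex numbers; the function name `inner` and the sample vectors are hypothetical choices.

```python
def inner(u, v):
    """Standard inner product on V_n(C): linear in u, conjugate-linear in v."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

u = (1 + 1j, 2)
v = (1j, 1 - 1j)

uv = inner(u, v)
vu = inner(v, u)

print(uv, vu)                      # the two values are complex conjugates
print(inner(u, u))                 # real and positive: |1+i|^2 + |2|^2
print(inner([2j * a for a in u], v) == 2j * uv)   # property (ii)
```

Without the conjugate, $(u, u)$ would equal $(1+i)^2 + 4 = 4 + 2i$ for this $u$, which is not real, so property (iv) could not even be stated.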

From now on in this chapter V will denote an inner product space. 

DEFINITION 5.3 The length or norm of $v \in V$, denoted by $\|v\|$, is
defined by

$$\|v\| = \sqrt{(v, v)}$$

A vector $v \in V$ with $\|v\| = 1$ is called a unit vector.

Note that by Definition 5.2 (iv), if $v \neq 0$ then $(v, v) > 0$ and so $\sqrt{(v, v)}$
is a positive real number. Also if $v \in V$, $v \neq 0$, then $\frac{1}{\|v\|} v$ is a unit vector;
we say that the vector $v$ has been normalized.

In the case of $V_n(\mathbb{R})$, this is clearly a generalization of length in
$V_3(\mathbb{R})$ considered in §5.1. In $V_n(\mathbb{C})$, if $v = (\alpha_1, \alpha_2, \ldots, \alpha_n) \in V_n(\mathbb{C})$,
then

$$(v, v) = \alpha_1\bar{\alpha}_1 + \alpha_2\bar{\alpha}_2 + \ldots + \alpha_n\bar{\alpha}_n = |\alpha_1|^2 + |\alpha_2|^2 + \ldots + |\alpha_n|^2$$

The next theorem shows that the length of a vector has some of the
familiar properties of the length of a line in the plane or in 3-space.


THEOREM 5.4 If $u, v \in V$ and $\alpha \in K$, then

(i) $\|\alpha v\| = |\alpha| \, \|v\|$,

(ii) $\|v\| > 0$ if $v \neq 0$,

(iii) $|(u, v)| \leq \|u\| \, \|v\|$,

(iv) $\|u + v\| \leq \|u\| + \|v\|$.

PROOF (i) $\|\alpha v\| = (\alpha v, \alpha v)^{1/2} = (\alpha\bar{\alpha}(v, v))^{1/2} = |\alpha| \, \|v\|$.

(ii) follows immediately from (iv) of Definition 5.2.

(iii) If $u = 0$, then both sides are 0 and the result holds.

If $u \neq 0$, put $w = v - \frac{(v, u)}{\|u\|^2} u$, then

$$(w, u) = (v, u) - \frac{(v, u)}{\|u\|^2} (u, u) = 0$$

and

$$0 \leq \|w\|^2 = (w, w) = \left(w, v - \frac{(v, u)}{\|u\|^2} u\right) = (w, v) = \|v\|^2 - \frac{(v, u)(u, v)}{\|u\|^2} = \|v\|^2 - \frac{|(u, v)|^2}{\|u\|^2}$$

that is

$$|(u, v)|^2 \leq \|u\|^2 \|v\|^2$$

$$|(u, v)| \leq \|u\| \, \|v\|$$

(iv) Now, by means of (iii) above, we have

$$\|u + v\|^2 = (u + v, u + v) = \|u\|^2 + (u, v) + (v, u) + \|v\|^2$$

$$\leq \|u\|^2 + |(u, v) + (v, u)| + \|v\|^2$$

$$\leq \|u\|^2 + |(u, v)| + |(v, u)| + \|v\|^2$$

$$\leq \|u\|^2 + 2\|u\| \, \|v\| + \|v\|^2$$

$$= (\|u\| + \|v\|)^2$$

and hence

$$\|u + v\| \leq \|u\| + \|v\|$$ ■

(iii) is called the Cauchy-Schwarz inequality and is a generalization
of familiar inequalities in other settings, as we see below, and (iv) is
called the triangle inequality, for in $V_3(\mathbb{R})$ it reduces to the well-known
statement that the length of a side of a triangle is less than the sum of
the lengths of the other two sides, as is seen in Fig. 3.

[Figure 3]

Alternatively, we define

$$d(u, v) = \|v - u\|$$

for all $u, v \in V$, and call $d(u, v)$ the distance between $u$ and $v$. Then we
have an alternative version of the triangle inequality.

COROLLARY If $u, v, w \in V$ then

$$d(u, w) \leq d(u, v) + d(v, w)$$

PROOF Replace $u$ and $v$ in (iv) above by $v - u$ and $w - v$
respectively, then $u + v$ is replaced by $w - u$ and the statement (iv) of
Theorem 5.4 now becomes

$$d(u, w) \leq d(u, v) + d(v, w)$$ ■

In $V_n(\mathbb{R})$ and $V_n(\mathbb{C})$ with standard inner products (iii) becomes

$$(\alpha_1\beta_1 + \alpha_2\beta_2 + \ldots + \alpha_n\beta_n)^2 \leq (\alpha_1^2 + \alpha_2^2 + \ldots + \alpha_n^2)(\beta_1^2 + \beta_2^2 + \ldots + \beta_n^2)$$

and

$$|\alpha_1\bar{\beta}_1 + \alpha_2\bar{\beta}_2 + \ldots + \alpha_n\bar{\beta}_n|^2 \leq (|\alpha_1|^2 + |\alpha_2|^2 + \ldots + |\alpha_n|^2)(|\beta_1|^2 + |\beta_2|^2 + \ldots + |\beta_n|^2)$$

respectively. These two statements (the first is obviously a special case
of the second) are called the Cauchy inequalities.

In $C[a, b]$ with the standard inner product, (iii) becomes

$$\left(\int_a^b f(t)\,g(t)\,dt\right)^2 \leq \left(\int_a^b f(t)^2\,dt\right)\left(\int_a^b g(t)^2\,dt\right)$$

which again is an important property of integration.

The concept of angle can also be introduced for real Euclidean
spaces $V$. If $u, v \in V$ are non-zero, then by Theorem 5.4 (iii), we have

$$\frac{|(u, v)|}{\|u\| \, \|v\|} \leq 1$$

or, what is the same,

$$-1 \leq \frac{(u, v)}{\|u\| \, \|v\|} \leq 1$$

and $(u, v)/\|u\| \, \|v\|$ is a real number. The angle $\theta$ between $u$ and $v$ is
then defined to be that number $0 \leq \theta \leq \pi$ such that

$$\cos\theta = \frac{(u, v)}{\|u\| \, \|v\|}$$

what has been said above ensuring that $\theta$ is well defined and also that
this definition is meaningful, i.e. $\cos\theta$ is a real number and
$-1 \leq \cos\theta \leq 1$ for all angles $\theta$. It is clear that this definition of angle
cannot be extended to unitary spaces. If $\theta = \pi/2$, then $(u, v) = 0$ and
we say that $u$ and $v$ are orthogonal or perpendicular to each other.

This will be the subject of the next section; since the statement
"$(u, v) = 0$" is meaningful for unitary spaces also, the concept of
orthogonal vectors can be introduced for arbitrary inner product spaces.

EXAMPLE


In $V_6(\mathbb{R})$ with standard inner product, if $u = (3, -2, -3, 1, 1, -1)$,
$v = (-1, 0, 0, 1, 1, 1)$, find the lengths of $u$ and $v$ and the angle between
$u$ and $v$. Verify (iv) of Theorem 5.4 in this case.

By Definition 5.3, we have

$$\|u\|^2 = (u, u) = 9 + 4 + 9 + 1 + 1 + 1 = 25$$

$$\|v\|^2 = (v, v) = 1 + 1 + 1 + 1 = 4$$

thus $\|u\| = 5$, $\|v\| = 2$.

The angle $\theta$ between $u$ and $v$ is given by

$$\cos\theta = \frac{-3 + 1 + 1 - 1}{5 \cdot 2} = -\frac{1}{5}$$

thus $\theta = \cos^{-1}(-1/5)$ (approximately $101°\,32'$). Also $u + v = (2, -2, -3, 2, 2, 0)$ and thus

$$\|u + v\| = \sqrt{4 + 4 + 9 + 4 + 4} = \sqrt{25} = 5 < 5 + 2 = \|u\| + \|v\|$$
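The arithmetic of this example, together with the Cauchy-Schwarz and triangle inequalities of Theorem 5.4, can be checked numerically. The sketch below assumes the standard inner product on $V_6(\mathbb{R})$; the helper names are illustrative.

```python
import math

def inner(u, v):
    """Standard inner product on V_n(R)."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """||u|| = sqrt((u, u))."""
    return math.sqrt(inner(u, u))

u = (3, -2, -3, 1, 1, -1)
v = (-1, 0, 0, 1, 1, 1)
s = tuple(a + b for a, b in zip(u, v))   # u + v

cos_theta = inner(u, v) / (norm(u) * norm(v))
theta_degrees = math.degrees(math.acos(cos_theta))

print(norm(u), norm(v), inner(u, v))          # lengths and inner product
print(cos_theta, theta_degrees)               # cosine and angle in degrees
print(abs(inner(u, v)) <= norm(u) * norm(v))  # Cauchy-Schwarz, Theorem 5.4 (iii)
print(norm(s) <= norm(u) + norm(v))           # triangle inequality, Theorem 5.4 (iv)
```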


Exercises 5.2 

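1. Which of the following are inner products on $V_2(\mathbb{R})$, if $u = (\alpha_1, \alpha_2)$,
$v = (\beta_1, \beta_2)$?

(i) $(u, v) = \alpha_1\beta_1 - \alpha_2\beta_1 - \alpha_1\beta_2 + 2\alpha_2\beta_2$

(ii) $(u, v) = \alpha_1\beta_1 + \alpha_2\beta_1 + \alpha_1\beta_2 - \alpha_2\beta_2$

(iii) $(u, v) = \alpha_1\beta_1 - \alpha_2\beta_1 + \alpha_1\beta_2 + 2\alpha_2\beta_2$

(iv) $(u, v) = \alpha_1^2\beta_1^2 + \alpha_2^2\beta_2^2$.

2. Which of the following are inner products on $V_3(\mathbb{R})$, if
$u = (\alpha_1, \alpha_2, \alpha_3)$, $v = (\beta_1, \beta_2, \beta_3)$?

(i) $(u, v) = \alpha_1\beta_1 + 2\alpha_2\beta_2 + 3\alpha_3\beta_3 + \alpha_1\beta_2 + \alpha_2\beta_1 + \alpha_1\beta_3 + \alpha_3\beta_1 + 2\alpha_2\beta_3 + 2\alpha_3\beta_2$

(ii) $(u, v) = \alpha_1\beta_1 + 2\alpha_2\beta_2 + 3\alpha_3\beta_3 + 2\alpha_1\beta_2 + 2\alpha_1\beta_3 + 4\alpha_2\beta_3$

(iii) $(u, v) = \alpha_1\beta_1 + \alpha_3\beta_3 - \alpha_1\beta_2 - \alpha_2\beta_1 - \alpha_1\beta_3 - \alpha_3\beta_1 + \alpha_2\beta_3 + \alpha_3\beta_2$.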

3. Which of the following are inner products on $C[-1, 1]$, the vector
space of real valued continuous functions defined on $[-1, 1]$, if
$f, g \in C[-1, 1]$?

(i) $(f, g) = \int_{-1}^{1} f(x)\,g(x)\,dx$

(ii) $(f, g) = \int_{-1}^{1} (1 - x^2)\,f(x)\,g(x)\,dx$

(iii) $(f, g) = \int_{-1}^{1} x^2 f(x)\,g(x)\,dx$.

4. Which of the following are inner products on $M_n(\mathbb{R})$, if $A, B \in M_n(\mathbb{R})$?

(i) $(A, B) = \operatorname{trace}(AB)$

(ii) $(A, B) = \det(AB)$.

5. Compute $\|u\|$, $\|v\|$, $\|u + v\|$, $(u, v)$ and the angle between $u$ and $v$
and verify that the Cauchy-Schwarz and Triangle Inequalities hold if

(i) $u = (1, 0, 2, -2)$ and $v = (2, 1, -2, 0)$ are elements of $V_4(\mathbb{R})$
with standard inner product;

(ii) $u = (1, 0, 2)$ and $v = (2, 1, 2)$ are elements of $V_3(\mathbb{R})$ for the
inner products defined in Exercise 2 above;

(iii) $u = x$ and $v = \cos \pi x$ are elements of $C[0, 1]$ with the
standard inner product.

6. If $V$ is a Euclidean space and $u, v \in V$ prove that

(i) $\big|\, \|u\| - \|v\| \,\big| \leq \|u - v\|$

(ii) $4(u, v) = \|u + v\|^2 - \|u - v\|^2$

(iii) $\|u - v\|^2 + \|u + v\|^2 = 2(\|u\|^2 + \|v\|^2)$.

What does (iii) say about the diagonals of a parallelogram?

7. If $u, v \in V$, show that the distance function $d(u, v)$ satisfies

(i) $d(u, v) \geq 0$

(ii) $d(u, v) = d(v, u)$

(iii) $d(u, v) = 0$ if and only if $u = v$.


5.3 Orthogonal Vectors 

Let V be an inner product space. 

DEFINITION 5.5 If $u, v \in V$ and $(u, v) = 0$ then $u$ and $v$ are said to
be orthogonal (or perpendicular) to each other. A subset $S$ of $V$ is
called an orthogonal set if the elements of $S$ are mutually orthogonal.
An orthogonal set is called an orthonormal set if each vector has unit
length, i.e. $\|v\| = 1$.

EXAMPLES 

1. In $V_6(\mathbb{R})$ with standard inner product, find the vectors orthogonal
to $v = (3, -2, -3, 1, 1, -1)$.

If $u = (\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6) \in V_6(\mathbb{R})$ is orthogonal to $v$ then

$$3\alpha_1 - 2\alpha_2 - 3\alpha_3 + \alpha_4 + \alpha_5 - \alpha_6 = 0$$

that is, the set of all vectors $u$ orthogonal to $v$ is given by all the
solutions to this linear equation.
Clearly

$$\{(1, 0, 0, 0, 0, 3),\ (0, 1, 0, 0, 0, -2),\ (0, 0, 1, 0, 0, -3),\ (0, 0, 0, 1, 0, 1),\ (0, 0, 0, 0, 1, 1)\}$$

is an $\mathbb{R}$-basis for the solution space to this linear equation and all
$\mathbb{R}$-linear combinations of these vectors are orthogonal to $v$.

2. The standard bases for $V_n(\mathbb{R})$ and $V_n(\mathbb{C})$ are orthonormal relative
to the standard inner product.

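3. In $C[-1, 1]$, prove that $\{P_0(x) = 1,\ P_1(x) = x,\ P_2(x) = \tfrac{1}{2}(3x^2 - 1)\}$
is an orthogonal set of vectors. Find the length of each element.

$$(P_0(x), P_1(x)) = \int_{-1}^{1} x\,dx = \left[\frac{x^2}{2}\right]_{-1}^{1} = \frac{1}{2} - \frac{1}{2} = 0$$

$$(P_0(x), P_2(x)) = \int_{-1}^{1} \tfrac{1}{2}(3x^2 - 1)\,dx = \left[\frac{x^3}{2} - \frac{x}{2}\right]_{-1}^{1} = \frac{1}{2} - \frac{1}{2} + \frac{1}{2} - \frac{1}{2} = 0$$

$$(P_1(x), P_2(x)) = \int_{-1}^{1} \tfrac{1}{2}x(3x^2 - 1)\,dx = \left[\frac{3x^4}{8} - \frac{x^2}{4}\right]_{-1}^{1} = \frac{3}{8} - \frac{1}{4} - \frac{3}{8} + \frac{1}{4} = 0$$

$$(P_0(x), P_0(x)) = \int_{-1}^{1} dx = [x]_{-1}^{1} = 1 + 1 = 2$$

$$(P_1(x), P_1(x)) = \int_{-1}^{1} x^2\,dx = \left[\frac{x^3}{3}\right]_{-1}^{1} = \frac{1}{3} + \frac{1}{3} = \frac{2}{3}$$

$$(P_2(x), P_2(x)) = \int_{-1}^{1} \tfrac{1}{4}(3x^2 - 1)^2\,dx = \left[\frac{9x^5}{20} - \frac{6x^3}{12} + \frac{x}{4}\right]_{-1}^{1} = \frac{2}{5}$$

thus $P_0(x)$, $P_1(x)$ and $P_2(x)$ have lengths $\sqrt{2}$, $\sqrt{2/3}$ and $\sqrt{2/5}$
respectively.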

LEMMA 5.6 An orthogonal set of non-zero vectors in an inner
product space $V$ is linearly independent.

PROOF Let $\{v_1, v_2, \ldots, v_n\}$ be an orthogonal set of non-zero
vectors in $V$. Consider

$$\alpha_1 v_1 + \alpha_2 v_2 + \ldots + \alpha_n v_n = 0$$

where $\alpha_i \in K$ ($i = 1, 2, \ldots, n$). Then for $1 \leq k \leq n$

$$0 = \left(\sum_{i=1}^{n} \alpha_i v_i, v_k\right) = \sum_{i=1}^{n} \alpha_i (v_i, v_k) = \alpha_k (v_k, v_k)$$

and since $v_k \neq 0$, $(v_k, v_k) \neq 0$ and so $\alpha_k = 0$.

Thus $\{v_1, v_2, \ldots, v_n\}$ is linearly independent over $K$. ■

We now prove our main result which shows that every finite 
dimensional inner product space has a basis consisting of orthonormal 
vectors. Furthermore, the proof is constructive in that it gives a 
procedure for determining an orthonormal basis from any given basis 
for V. 

THEOREM 5.7 (Gram-Schmidt Orthogonalization Procedure). 

Every finite dimensional inner product space has a basis consisting of 
orthonormal vectors. 

PROOF Let $\{v_1, v_2, \ldots, v_n\}$ be a $K$-basis for the inner product
space $V$. Define the subset $\{u_1, u_2, \ldots, u_n\}$ of $V$ inductively as follows

$$u_1 = v_1$$

$$u_2 = v_2 - \frac{(v_2, u_1)}{\|u_1\|^2} u_1$$

$$u_3 = v_3 - \frac{(v_3, u_2)}{\|u_2\|^2} u_2 - \frac{(v_3, u_1)}{\|u_1\|^2} u_1$$

$$\vdots$$

$$u_n = v_n - \frac{(v_n, u_{n-1})}{\|u_{n-1}\|^2} u_{n-1} - \ldots - \frac{(v_n, u_1)}{\|u_1\|^2} u_1$$

that is, the coefficient of $u_j$ in $u_i$ ($j < i$) is minus the inner product of $v_i$
with $u_j$ divided by the square of the length of $u_j$.

Then $u_1, u_2, \ldots, u_n$ are non-zero, otherwise we contradict the linear
independence of $\{v_1, \ldots, v_n\}$. We show that $\{u_1, u_2, \ldots, u_n\}$ is
orthogonal by induction on $n$. If $n = 2$, then

$$(u_2, u_1) = (v_2, u_1) - \frac{(v_2, u_1)}{\|u_1\|^2} (u_1, u_1) = 0$$
and so $\{u_1, u_2\}$ is an orthogonal set.

If $n > 2$, assume that $\{u_1, u_2, \ldots, u_{n-1}\}$ is an orthogonal set. To show
that $\{u_1, u_2, \ldots, u_n\}$ is an orthogonal set, we must show that
$(u_n, u_i) = 0$ ($i = 1, 2, \ldots, n - 1$). Now, for $i = 1, 2, \ldots, n - 1$, we
have by the induction assumption that

$$(u_n, u_i) = (v_n, u_i) - \frac{(v_n, u_{n-1})}{\|u_{n-1}\|^2} (u_{n-1}, u_i) - \ldots - \frac{(v_n, u_1)}{\|u_1\|^2} (u_1, u_i)$$

$$= (v_n, u_i) - \frac{(v_n, u_i)}{\|u_i\|^2} (u_i, u_i)$$

$$= 0$$

as required.

Furthermore, by the previous lemma, $\{u_1, u_2, \ldots, u_n\}$ is linearly
independent over $K$ and so is a $K$-basis for $V$. Now, put
$w_i = \frac{1}{\|u_i\|} u_i$ ($i = 1, 2, \ldots, n$); then each $w_i$ is a unit vector and
$\{w_1, w_2, \ldots, w_n\}$ is an orthonormal basis for $V$. ■
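The inductive definition in the proof is itself an algorithm. The following is a minimal sketch for $V_n(\mathbb{R})$ with the standard inner product, assuming the input vectors are linearly independent; the function names are hypothetical, exact fractions are used for the orthogonal vectors, and floating point is used only in the final normalization.

```python
from fractions import Fraction
import math

def inner(u, v):
    """Standard inner product on V_n(R)."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    """Return u_1, ..., u_n with u_i = v_i - sum_{j<i} ((v_i, u_j)/||u_j||^2) u_j."""
    ortho = []
    for v in basis:
        u = [Fraction(x) for x in v]
        for prev in ortho:
            coeff = inner(v, prev) / inner(prev, prev)
            u = [a - coeff * b for a, b in zip(u, prev)]
        ortho.append(u)
    return ortho

def normalize(u):
    """Divide u by its length to obtain a unit vector."""
    length = math.sqrt(inner(u, u))
    return [float(x) / length for x in u]

vectors = [(1, 0, 1), (1, 0, -1), (0, 3, 4)]
ortho = gram_schmidt(vectors)
orthonormal = [normalize(u) for u in ortho]

print(ortho)         # an orthogonal basis with exact rational entries
print(orthonormal)   # the corresponding orthonormal basis (floating point)
```

The sample input is the basis of $V_3(\mathbb{R})$ used in Example 1 below, so the output can be compared with the hand computation there.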

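In §5.2, Example 2 we defined the standard inner product on $V_n(\mathbb{C})$;
we now show that every inner product on a finite dimensional inner
product space essentially takes this form.

If $V$ is a finite dimensional inner product space then by the above
theorem it has a $K$-basis $\{v_1, v_2, \ldots, v_n\}$ consisting of orthonormal
vectors. If $u, v \in V$ then $u = \sum_{i=1}^{n} \alpha_i v_i$, $v = \sum_{i=1}^{n} \beta_i v_i$ ($\alpha_i, \beta_i \in K$) and

$$(u, v) = \left(\sum_{i=1}^{n} \alpha_i v_i, \sum_{j=1}^{n} \beta_j v_j\right) = \sum_{i,j=1}^{n} \alpha_i \bar{\beta}_j (v_i, v_j) = \sum_{i=1}^{n} \alpha_i \bar{\beta}_i$$

since $(v_i, v_j) = \delta_{ij}$ ($i, j = 1, 2, \ldots, n$).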

EXAMPLES 

1. Apply the Gram-Schmidt orthogonalization procedure to the
vectors $v_1 = (1, 0, 1)$, $v_2 = (1, 0, -1)$, $v_3 = (0, 3, 4)$ to obtain an
orthonormal basis for $V_3(\mathbb{R})$.

Put

$$u_1 = v_1 = (1, 0, 1)$$

$$u_2 = v_2 - \frac{(v_2, u_1)}{(u_1, u_1)} u_1 = (1, 0, -1)$$

$$u_3 = v_3 - \frac{(v_3, u_2)}{(u_2, u_2)} u_2 - \frac{(v_3, u_1)}{(u_1, u_1)} u_1$$

$$= (0, 3, 4) - \left(-\tfrac{4}{2}\right)(1, 0, -1) - \tfrac{4}{2}(1, 0, 1)$$

$$= (0, 3, 0)$$

then $\{u_1, u_2, u_3\}$ is an orthogonal set. The lengths of these vectors are
$\sqrt{2}$, $\sqrt{2}$, 3 respectively so the required $\mathbb{R}$-basis for $V_3(\mathbb{R})$ is

$$\left\{\tfrac{1}{\sqrt{2}}(1, 0, 1),\ \tfrac{1}{\sqrt{2}}(1, 0, -1),\ (0, 1, 0)\right\}$$


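2. Extend the orthonormal set $\{\tfrac{1}{3}(2, 0, -1, 2),\ \tfrac{1}{3}(2, 1, 0, -2)\}$ to give
an orthonormal basis for $V_4(\mathbb{R})$. It is clear that the set
$\{v_1 = \tfrac{1}{3}(2, 0, -1, 2),\ v_2 = \tfrac{1}{3}(2, 1, 0, -2),\ v_3 = (1, 0, 0, 0),\ v_4 = (0, 0, 0, 1)\}$
is an $\mathbb{R}$-basis for $V_4(\mathbb{R})$. Put

$$u_1 = v_1$$

$$u_2 = v_2$$

$$u_3 = v_3 - \frac{(v_3, u_2)}{(u_2, u_2)} u_2 - \frac{(v_3, u_1)}{(u_1, u_1)} u_1$$

$$= (1, 0, 0, 0) - \tfrac{2}{3} \cdot \tfrac{1}{3}(2, 1, 0, -2) - \tfrac{2}{3} \cdot \tfrac{1}{3}(2, 0, -1, 2)$$

$$= \tfrac{1}{9}(1, -2, 2, 0)$$

$$u_4 = v_4 - \frac{(v_4, u_3)}{(u_3, u_3)} u_3 - \frac{(v_4, u_2)}{(u_2, u_2)} u_2 - \frac{(v_4, u_1)}{(u_1, u_1)} u_1$$

$$= (0, 0, 0, 1) - 0 - \left(-\tfrac{2}{3}\right) \tfrac{1}{3}(2, 1, 0, -2) - \tfrac{2}{3} \cdot \tfrac{1}{3}(2, 0, -1, 2)$$

$$= \left(0, \tfrac{2}{9}, \tfrac{2}{9}, \tfrac{1}{9}\right)$$

After normalizing these vectors we find that

$$\left\{\tfrac{1}{3}(2, 0, -1, 2),\ \tfrac{1}{3}(2, 1, 0, -2),\ \tfrac{1}{3}(1, -2, 2, 0),\ \tfrac{1}{3}(0, 2, 2, 1)\right\}$$
is the required orthonormal basis.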

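3. Let $V$ be the subspace of real polynomials of degree $\le 3$ in $C[-1, 1]$.
Find an orthonormal basis for $V$.

$\{f_0 = 1,\ f_1 = x,\ f_2 = x^2,\ f_3 = x^3\}$ is an $\mathbb{R}$-basis for $V$.

By the Gram-Schmidt Orthogonalization Procedure we obtain an
orthogonal set of vectors $\{g_0, g_1, g_2, g_3\}$ as follows:

\[
\begin{aligned}
g_0 &= f_0 = 1 \\
g_1 &= f_1 - \frac{(f_1, f_0)}{(f_0, f_0)}\,f_0
 = x - \frac{\int_{-1}^{1} x\,dx}{\int_{-1}^{1} dx} \cdot 1 = x \\
g_2 &= f_2 - \frac{(f_2, g_1)}{(g_1, g_1)}\,g_1 - \frac{(f_2, g_0)}{(g_0, g_0)}\,g_0
 = x^2 - \frac{\int_{-1}^{1} x^3\,dx}{\int_{-1}^{1} x^2\,dx}\,x - \frac{\int_{-1}^{1} x^2\,dx}{\int_{-1}^{1} dx} \cdot 1
 = \tfrac13(3x^2 - 1) \\
g_3 &= f_3 - \frac{(f_3, g_2)}{(g_2, g_2)}\,g_2 - \frac{(f_3, g_1)}{(g_1, g_1)}\,g_1 - \frac{(f_3, g_0)}{(g_0, g_0)}\,g_0 \\
&= x^3 - \frac{\int_{-1}^{1} \tfrac13(3x^2 - 1)\,x^3\,dx}{\int_{-1}^{1} \tfrac19(3x^2 - 1)^2\,dx} \cdot \tfrac13(3x^2 - 1)
 - \frac{\int_{-1}^{1} x^4\,dx}{\int_{-1}^{1} x^2\,dx}\,x
 - \frac{\int_{-1}^{1} x^3\,dx}{\int_{-1}^{1} dx} \cdot 1 \\
&= \tfrac15(5x^3 - 3x)
\end{aligned}
\]

To normalize these elements, we note that

\[ (g_0, g_0) = 2, \quad (g_1, g_1) = \tfrac23, \quad (g_2, g_2) = \tfrac{8}{45}, \quad (g_3, g_3) = \tfrac{8}{175} \]

and the required orthonormal basis is

\[ \Bigl\{ \tfrac{1}{\sqrt{2}},\; \sqrt{\tfrac32}\,x,\; \sqrt{\tfrac58}\,(3x^2 - 1),\; \sqrt{\tfrac78}\,(5x^3 - 3x) \Bigr\}. \]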

These polynomials are the first four of the Legendre polynomials which 
are important in analysis. For further information on these and other 
applications of inner product spaces in analysis, e.g. Fourier series, see 
D.L. Kreider, R.G. Kuller, D.R. Ostberg and F.W. Perkins (loc. cit.).
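
The polynomial computation of Example 3 can be reproduced in the same way. This is a sketch only, assuming the inner product $(f, g) = \int_{-1}^{1} f(x)g(x)\,dx$ and representing a polynomial by its list of coefficients, constant term first; the names `inner` and `gram_schmidt_polys` are illustrative.

```python
from fractions import Fraction


def inner(p, q):
    """(p, q) = integral of p(x) q(x) over [-1, 1]."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                # The integral of x^(i+j) over [-1, 1] is 2/(i+j+1);
                # odd powers integrate to zero.
                total += a * b * Fraction(2, i + j + 1)
    return total


def gram_schmidt_polys(polys):
    """Orthogonalize the polynomials in the given order."""
    result = []
    for f in polys:
        g = [Fraction(c) for c in f]
        for h in result:
            coeff = inner(f, h) / inner(h, h)
            g = [a - coeff * b for a, b in zip(g, h)]
        result.append(g)
    return result


# 1, x, x^2, x^3 as coefficient lists padded to length 4.
monomials = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
g = gram_schmidt_polys(monomials)
norms_squared = [inner(p, p) for p in g]
```

The entries of `g` are the coefficients of $g_0, \ldots, g_3$, and `norms_squared` gives the values $(g_i, g_i)$ needed for normalization.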

Exercises 5.3 

(Unless otherwise stated the inner products are the standard inner 
products.) 
1. Show that the vectors in each of the following pairs are orthogonal

(i) $(2, 3, -2, 1, 0, 1)$ and $(2, -1, 1, 0, 2, 1)$ in $V_6(\mathbb{R})$

(ii) $(i, 1, -i)$ and $(1 - i, 2, 1 + i)$ in $V_3(\mathbb{C})$

(iii) $1$ and $\cos \pi x$ in $C[0, 1]$.

2. Find all vectors which are orthogonal to the following

(i) $(-1, 1, 2, -1)$ in $V_4(\mathbb{R})$

(ii) $(1, 1 + i)$ in $V_2(\mathbb{C})$

3. In $C[0, 1]$ prove that $\cos 2m\pi x$ and $\cos 2n\pi x$ $(m \ne n)$ are orthogonal
and find a quadratic polynomial orthogonal to $1$ and $x$. Furthermore,
find the length of $\cos m\pi x$ and find a necessary and sufficient condition
for $a + bx$ and $c + dx$ to be orthogonal in $C[0, 1]$.

4. Show that $\sin \pi x, \sin 2\pi x, \ldots, \sin n\pi x$ are orthogonal in $C[0, 1]$.
Obtain an orthonormal set of functions from these.

5. Use the Gram-Schmidt Orthogonalization Procedure to
orthogonalise

(i) $\{(1, -1, 1), (2, 1, 1), (1, 0, 1)\}$ in $V_3(\mathbb{R})$

(ii) $\{(1, -1, 1, 1), (0, 1, 0, 1), (2, -1, 1, 1)\}$ in $V_4(\mathbb{R})$

(iii) $\{(1, -1, 0), (i, 1, 2)\}$ in $V_3(\mathbb{C})$.

6. Complete to an orthonormal basis 

(i) $\bigl\{\bigl(\tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}}\bigr),\ (0, 1, 0)\bigr\}$ for $V_3(\mathbb{R})$

(ii) lj(M, 1,0, 5 0, 1,0 1)} for f 4( c )- 

7. Find orthonormal bases for $V_3(\mathbb{R})$ for the maps in Exercise 2 in
§5.2 which are inner products.

8. Let $V$ be the subspace of $C[0, 1]$ containing real polynomials of
degree at most 3. Apply the Gram-Schmidt Orthogonalization
Procedure to the $\mathbb{R}$-basis $\{1, x, x^2, x^3\}$ for $V$.

9. If $\{v_1, v_2, \ldots, v_n\}$ is an orthonormal basis for an inner product
space $V$, prove that every $v \in V$ can be expressed as

\[ v = \sum_{i=1}^{n} (v, v_i)\,v_i. \]

(i) Find an orthonormal basis for $V_3(\mathbb{C})$ and find the co-ordinates
of $(1, i, -i)$ relative to this basis;

(ii) Find an orthonormal basis for the subspace of $C[0, 1]$
consisting of polynomials of degree at most 2 and find the co-ordinates
of $x^2 + 1$ and $x^2 - x + 1$ relative to this basis.
10. Apply the Gram-Schmidt Orthogonalization Procedure to the
$\mathbb{R}$-basis $\{1, x, x^2, x^3\}$ for $P_3(\mathbb{R})$ where the inner product is defined by

Jfj 0 -X 1 ) f(pc)g(x) dx 

11. Let $V$ be a finite dimensional inner product space with orthonormal
basis $\mathcal{B} = \{v_1, \ldots, v_n\}$ and $T$ a linear transformation on $V$. Prove that
$(T)_{\mathcal{B}} = ((Tv_j, v_i))$.

12. If $V$ is a Euclidean space and $u, v \in V$ are such that $\|u\| = \|v\|$,
prove that $u - v$ is orthogonal to $u + v$. What does this say about the
diagonals of a rhombus?


5.4 Application to the Rank of a Matrix 

In order to apply the above to prove that the column rank of a matrix
is equal to its row rank we need first to introduce some additional
concepts. For the first part of this section, let $V$ be an arbitrary $K$-space.

DEFINITION 5.8 Let $U$ and $W$ be subspaces of $V$; then $V$ is called
the direct sum of $U$ and $W$, written $V = U \oplus W$, if

(i) $V = U + W$

(ii) every element $v$ of $V$ can be uniquely expressed as $v = u + w$,
where $u \in U$, $w \in W$.

The first lemma gives an alternative criterion for $V$ to be the direct
sum of $U$ and $W$.

LEMMA 5.9 If $V = U + W$ then $V = U \oplus W$ if and only if $U \cap W = \{0\}$.

PROOF If $v \in U \cap W$ then $v \in U$ and $v \in W$, so

\[ v = v + 0 = 0 + v \]

and since $V = U \oplus W$, the uniqueness of this expression gives $v = 0$ and $U \cap W = \{0\}$.

Conversely, assume that $U \cap W = \{0\}$.

If $v \in V$, suppose that $v = u_1 + w_1 = u_2 + w_2$, where $u_1, u_2 \in U$, $w_1,
w_2 \in W$, are two expressions for $v$; then $u_1 - u_2 = w_2 - w_1 \in U \cap W$
and so $u_1 = u_2$, $w_1 = w_2$ and the above expression for $v$ is unique. ■

LEMMA 5.10 If $V$ is a finite dimensional $K$-space and $U$ is a subspace
of $V$, then there exists a subspace $W$ of $V$ such that

\[ V = U \oplus W \]

and $(V : K) = (U : K) + (W : K)$.

PROOF If $\{v_1, \ldots, v_m\}$ is a $K$-basis for $U$, extend this basis to give
a $K$-basis $\{v_1, \ldots, v_m, v_{m+1}, \ldots, v_n\}$ for $V$. Let $W$ be the subspace
generated by $\{v_{m+1}, \ldots, v_n\}$; then it follows that $V = U \oplus W$ and
$(V : K) = (U : K) + (W : K)$. ■

From now on, let $V$ be an inner product space.

DEFINITION 5.11 If $S$ is a subset of $V$, the orthogonal complement of $S$ is
$S^{\perp} = \{x \in V \mid (x, s) = 0 \text{ for all } s \in S\}$.

LEMMA 5.12 If $S$ is a subset of $V$, $S^{\perp}$ is a subspace of $V$.

PROOF $S^{\perp}$ is non-empty since $0 \in S^{\perp}$.
If $u, v \in S^{\perp}$, $\alpha \in K$, then

\[ (\alpha u + v, s) = \alpha(u, s) + (v, s) = 0 \quad \text{for all } s \in S \]

and so $\alpha u + v \in S^{\perp}$ and $S^{\perp}$ is a subspace of $V$. ■

THEOREM 5.13 Let $W$ be a subspace of a finite dimensional inner
product space $V$; then

\[ V = W \oplus W^{\perp}. \]

PROOF We can clearly assume that $W \ne \{0\}$ and $W \ne V$. If
$(V : K) = n$ and $(W : K) = r < n$ then by Theorem 5.7 $W$ has an
orthonormal basis $\{w_1, w_2, \ldots, w_r\}$. Extend this to give a $K$-basis
$\{w_1, \ldots, w_r, v_{r+1}, \ldots, v_n\}$ for $V$. Applying the Gram-Schmidt
orthogonalization procedure to this $K$-basis will give an orthonormal
basis $\{w_1, \ldots, w_r, \ldots, w_n\}$ for $V$. We show that $\{w_{r+1}, \ldots, w_n\}$ is a
$K$-basis for $W^{\perp}$. If $v \in W^{\perp}$, then $v \in V$ and $v = \sum_{i=1}^{n} \alpha_i w_i$. If $1 \le i \le r$
then

\[ 0 = (v, w_i) = \Bigl(\sum_{j=1}^{n} \alpha_j w_j,\; w_i\Bigr) = \alpha_i \]

and $v = \sum_{i=r+1}^{n} \alpha_i w_i$, i.e. $\{w_{r+1}, \ldots, w_n\}$ generates $W^{\perp}$.

Conversely, every $K$-linear combination of $\{w_{r+1}, \ldots, w_n\}$ is in $W^{\perp}$
and thus $\{w_{r+1}, \ldots, w_n\}$ is a $K$-basis for $W^{\perp}$. Hence $V = W + W^{\perp}$; moreover
$W \cap W^{\perp} = \{0\}$, since $(w, w) = 0$ implies $w = 0$, and so by Lemma 5.9

\[ V = W \oplus W^{\perp}. \]

■

We now apply this to the solution of a system of linear equations.
Consider the system of linear equations

\[ \sum_{j=1}^{n} \alpha_{ij} x_j = 0 \quad (i = 1, 2, \ldots, m). \]

If $A = (\alpha_{ij}) \in M_{m,n}(K)$ and $x = (x_1, x_2, \ldots, x_n)$ then this system may
be represented in matrix form as

\[ A x^{t} = 0. \]

Now let $r_i = (\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in}) \in V_n(K)$ $(i = 1, 2, \ldots, m)$; then this
system can also be represented as

\[ (r_i, x) = 0 \quad (i = 1, 2, \ldots, m). \]

Let $R_A$ be the subspace of $V_n(K)$ generated by $\{r_1, r_2, \ldots, r_m\}$.

Thus the solution space of the system is $R_A^{\perp}$.

But, by Theorem 5.13,

\[ V_n(K) = R_A \oplus R_A^{\perp} \]

and the dimension of the solution space $= (R_A^{\perp} : K) = n - (R_A : K)$,
where $(R_A : K)$ is the row rank of $A$. However, by Theorem 4.22, the
dimension of the solution space is $n$ minus the column rank of $A$. Thus, we have
proved

THEOREM 5.14 If $A \in M_{m,n}(K)$ then the row rank of $A$ is equal to
the column rank of $A$.
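
Theorem 5.14 can be illustrated numerically. The sketch below computes the rank by Gaussian elimination over the rationals (a standard technique, not the inner product argument used above) and compares a matrix with its transpose; the matrix `A` is an arbitrary illustration, not one taken from the text.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank_count = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a non-zero pivot in this column, below the rows already used.
        pivot = next((r for r in range(rank_count, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank_count], m[pivot] = m[pivot], m[rank_count]
        for r in range(rank_count + 1, len(m)):
            factor = m[r][col] / m[rank_count][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank_count])]
        rank_count += 1
    return rank_count


def transpose(rows):
    return [list(col) for col in zip(*rows)]


A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]
row_rank = rank(A)                       # dimension of the space spanned by the rows
column_rank = rank(transpose(A))         # rows of the transpose are the columns of A
solution_space_dim = len(A[0]) - row_rank
```

For this matrix both ranks agree, and `solution_space_dim` is the dimension $n - (R_A : K)$ of the solution space of the homogeneous system.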

We note that the above proof is valid only over the real and complex
fields although the result is true for arbitrary fields as was seen in
Chapter 4. The interested reader can look up, for example, S. Lang,
Linear Algebra (Addison-Wesley), to see how the general case can be
handled by similar methods, where (iv) of Definition 5.2 has been
replaced by another condition (non-degeneracy of inner products)
which allows the above to be applied to $V_n(K)$ for arbitrary $K$.
CHAPTER 6


Diagonalization of Matrices and 
Linear Transformations 

6.1 Introduction 

Let $A, B \in M_n(K)$; then in Definition 4.5 of Chapter IV we have defined
similarity of matrices by saying that $B$ is similar to $A$ (written $A \sim B$)
if there exists an invertible matrix $P \in M_n(K)$ such that

\[ B = P^{-1}AP. \]

The following lemma is easily proved.

LEMMA 6.1 Similarity of matrices is an equivalence relation on
$M_n(K)$.

PROOF $\sim$ is reflexive since $A = I^{-1}AI$ for all $A \in M_n(K)$. $\sim$ is
symmetric since if $A \sim B$ there exists an invertible matrix $P \in M_n(K)$
such that $B = P^{-1}AP$, from which it follows that $A = (P^{-1})^{-1}BP^{-1}$ and
$B \sim A$. Finally, $\sim$ is transitive since if $A \sim B$ and $B \sim C$ there exist
invertible matrices $P, Q \in M_n(K)$ such that $B = P^{-1}AP$ and $C = Q^{-1}BQ$
and hence $C = Q^{-1}P^{-1}APQ = (PQ)^{-1}A(PQ)$, where $PQ$ is an invertible
matrix in $M_n(K)$. ■

This means that M n ( K ) is partitioned into equivalence classes under 
this equivalence relation. We have the problem of determining 
representatives of these equivalence classes which are in as “simple” a 
form as possible. This will be one of the main goals of this chapter. 

This problem may also be formulated in terms of linear transformations. 

Let $V$ be a finite dimensional $K$-space with $(V : K) = n$ and $T \in \mathcal{L}(V)$.
In Chapter IV, we saw that the matrices of $T$ relative to different
$K$-bases for $V$ are similar to each other. Thus, the above problem is
equivalent to that of determining a $K$-basis for $V$ such that the matrix
of $T$ relative to that $K$-basis is as "simple" as possible. To this end, we
define in the next section the eigenvalues and eigenvectors of a matrix
(or of a linear transformation).
This may be further illustrated by considering an example. A linear
transformation $T$ on $V_2(\mathbb{R})$ is defined by

\[ Te_1 = 2e_1 - e_2, \qquad Te_2 = 3e_1 - 2e_2 \]

where $\{e_1, e_2\}$ is the standard basis. Given an arbitrary point
$(\lambda, \mu) \in V_2(\mathbb{R})$, then

\[ T(\lambda, \mu) = 2\lambda e_1 - \lambda e_2 + 3\mu e_1 - 2\mu e_2 = (2\lambda + 3\mu,\; -\lambda - 2\mu). \]

Now let $e_1' = 3e_1 - e_2$, $e_2' = e_1 - e_2$; then $\{e_1', e_2'\}$ is also an $\mathbb{R}$-basis for
$V_2(\mathbb{R})$ and

\[
\begin{aligned}
Te_1' &= 3(Te_1) - Te_2 = 6e_1 - 3e_2 - 3e_1 + 2e_2 = 3e_1 - e_2 = e_1' \\
Te_2' &= Te_1 - Te_2 = 2e_1 - e_2 - 3e_1 + 2e_2 = -e_1 + e_2 = -e_2'
\end{aligned}
\]

and $T(\lambda, \mu) = (\lambda, -\mu)$, where now $(\lambda, \mu)$ are the co-ordinates relative
to the $\mathbb{R}$-basis $\{e_1', e_2'\}$. Thus the effect of the linear transformation $T$
on an arbitrary point (or vector) is far clearer in terms of the second
$\mathbb{R}$-basis: we have a reflection in the first axis $e_1'$, as shown in
Fig. 1. To find the image of the point $P$ we draw a line through $P$
parallel to $e_2'$; the image of $P$ will be at a point $P'$ such that $PX = XP'$,
where $X$ is the point where this line cuts the $e_1'$-axis.



Figure 1 
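
The change of basis in this example is easy to verify directly. The following sketch assumes the matrix of $T$ relative to the standard basis has $Te_1$ and $Te_2$ as its columns, and takes $e_2' = e_1 - e_2$, as the computation of $Te_2'$ above indicates; the helper `apply` is illustrative.

```python
# Matrix of T relative to the standard basis: the columns are T e1 and T e2.
T = [[2, 3],
     [-1, -2]]


def apply(matrix, vector):
    """Multiply a matrix by a column vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


e1_new = [3, -1]   # e1' = 3 e1 - e2
e2_new = [1, -1]   # e2' = e1 - e2

image_e1_new = apply(T, e1_new)   # should equal e1'
image_e2_new = apply(T, e2_new)   # should equal -e2'
```

In the language of the next section, $e_1'$ and $e_2'$ are eigenvectors of $T$ with eigenvalues $1$ and $-1$.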
6.2 Eigenvalues and Eigenvectors

DEFINITION 6.2 (1) If $A \in M_n(K)$, then an element $\lambda \in K$ is called
an eigenvalue of $A$ if there exists a non-zero $X \in V_n(K)$ such that
$AX = \lambda X$ ($X$ is a column vector). The vector $X$ is called an eigenvector
corresponding to the eigenvalue $\lambda$. (2) If $V$ is an $n$-dimensional $K$-space
and $T \in \mathcal{L}(V)$, then an element $\lambda \in K$ is called an eigenvalue of $T$ if
there exists a non-zero $v \in V$ such that $Tv = \lambda v$; $v$ is called an eigenvector
corresponding to the eigenvalue $\lambda$.


The connection between (1) and (2) is brought out by the following:

If $A$ is the matrix of $T$ relative to a $K$-basis $\mathcal{B} = \{v_1, \ldots, v_n\}$,
$\lambda$ is an eigenvalue of $T$ and $v$ is an eigenvector corresponding to $\lambda$ with

\[ v = \sum_{i=1}^{n} \alpha_i v_i, \qquad \text{put } X = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix}; \]

then $AX = \lambda X$, that is, $\lambda$ is an eigenvalue
of $A$. Furthermore, if $\mathcal{B}' = \{v_1', \ldots, v_n'\}$ is another $K$-basis for $V$, then
$(T)_{\mathcal{B}'} = PAP^{-1}$, where $P = (p_{ij})$ and $v_j = \sum_{i=1}^{n} p_{ij} v_i'$ $(j = 1, \ldots, n)$
(see p.98) and hence

\[ (PAP^{-1})(PX) = \lambda(PX) \]

i.e. $\lambda$ is also an eigenvalue of $PAP^{-1}$ with corresponding eigenvector $PX$
(which is non-zero since $P$ is invertible). Thus, we could have defined
the eigenvalues of a linear transformation $T$ to be the eigenvalues of the
matrix of $T$ relative to any $K$-basis for $V$. In the sequel, we concentrate
on matrices but will state and prove some of the results in terms of
linear transformations.

If $\lambda \in K$ is an eigenvalue of $A \in M_n(K)$ then from Definition 6.2
there exists a non-zero column vector $X$ such that

\[ (A - \lambda I)X = 0. \]

By Corollary 3 to Theorem 2.7 such an $X$ exists if and only if
$\det(A - \lambda I) = 0$.

Thus we have proved

THEOREM 6.3 $\lambda \in K$ is an eigenvalue of $A$ if and only if
$\det(A - \lambda I) = 0$.

If $A \in M_n(K)$, then $\det(A - xI)$ is a polynomial of degree $n$ in $x$.
DEFINITION 6.4 $\chi_A(x) = \det(A - xI)$ is called the characteristic
polynomial of the matrix $A$.

Thus, Theorem 6.3 may alternatively be stated as

THEOREM 6.3' $\lambda \in K$ is an eigenvalue of $A$ if and only if it is a root
of the characteristic polynomial of $A$.

This gives us a practical method for computing the eigenvalues of $A$.
$\chi_A(x)$ is a polynomial of degree $n$ in $x$; by the so-called fundamental
theorem of algebra, every polynomial of degree $n$ over the complex
field factorises completely into $n$ linear factors over $\mathbb{C}$,
i.e. if $\chi_A(x) = (-1)^n(x^n + a_1 x^{n-1} + \ldots + a_n)$,
where $a_1, a_2, \ldots, a_n \in \mathbb{C}$, then

\[ \chi_A(x) = (-1)^n (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_n) \]

where $\lambda_1, \lambda_2, \ldots, \lambda_n \in \mathbb{C}$. Thus over the complex field, an $n \times n$ matrix
has at most $n$ eigenvalues (and has at least one eigenvalue). It is also
clear that the existence of eigenvalues depends on the field over which
we are working.


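EXAMPLES

1. If $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ then

\[ \chi_A(x) = \begin{vmatrix} -x & 1 \\ -1 & -x \end{vmatrix} = x^2 + 1 \]

which is irreducible over $\mathbb{R}$ (i.e. does not factorise in $\mathbb{R}$), but it factorises
over $\mathbb{C}$ to give

\[ \chi_A(x) = (x + i)(x - i). \]

Hence $A$ has no eigenvalues in $\mathbb{R}$ but has the eigenvalues $i$ and $-i$ in $\mathbb{C}$.

2. If $A = \begin{pmatrix} 2 & 1 \\ -1 & 0 \end{pmatrix}$ then

\[ \chi_A(x) = \begin{vmatrix} 2 - x & 1 \\ -1 & -x \end{vmatrix} = (1 - x)\begin{vmatrix} 1 & 1 \\ -1 & -x \end{vmatrix} = (1 - x)^2. \]

Thus $A$ has one eigenvalue $\lambda = 1$ in $\mathbb{R}$ (and also in $\mathbb{C}$).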
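
For larger matrices, expanding $\det(A - xI)$ by hand becomes laborious. As an aside, the sketch below computes the coefficients of the characteristic polynomial with the Faddeev-LeVerrier recurrence; this is a different technique from the determinant expansion used in the text, chosen here only because it is short to code. The function names are illustrative, and the sign convention $\chi_A(x) = \det(A - xI)$ of Definition 6.4 is assumed.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def char_poly(a):
    """Coefficients of det(A - xI), constant term first (Faddeev-LeVerrier)."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    coeffs = [Fraction(0)] * (n + 1)      # coefficients of det(xI - A)
    coeffs[n] = Fraction(1)
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A M_{k-1} + c_{n-k+1} I, then c_{n-k} = -trace(A M_k) / k.
        m = mat_mul(a, m)
        for i in range(n):
            m[i][i] += coeffs[n - k + 1]
        am = mat_mul(a, m)
        coeffs[n - k] = -sum(am[i][i] for i in range(n)) / k
    sign = (-1) ** n                      # det(A - xI) = (-1)^n det(xI - A)
    return [sign * c for c in coeffs]


example_1 = char_poly([[0, 1], [-1, 0]])   # coefficients of x^2 + 1
example_2 = char_poly([[2, 1], [-1, 0]])   # coefficients of (1 - x)^2
```

The eigenvalues are then the roots of the resulting polynomial in the field under consideration.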

Once the eigenvalues have been determined, the corresponding
eigenvectors are easily calculated, i.e. if $A = (\alpha_{ij})$ and $\lambda$ is an eigenvalue
of $A$ and $X = (x_1, x_2, \ldots, x_n)$ is the corresponding eigenvector, then
from Definition 6.2 $X$ is a non-trivial solution of the system of linear
equations

\[
\begin{aligned}
(\alpha_{11} - \lambda)x_1 + \alpha_{12}x_2 + \ldots + \alpha_{1n}x_n &= 0 \\
\alpha_{21}x_1 + (\alpha_{22} - \lambda)x_2 + \ldots + \alpha_{2n}x_n &= 0 \\
&\ \,\vdots \\
\alpha_{n1}x_1 + \alpha_{n2}x_2 + \ldots + (\alpha_{nn} - \lambda)x_n &= 0
\end{aligned}
\]

A non-trivial solution certainly exists since $\det(A - \lambda I) = 0$. We shall
see later (Theorem 6.9) that the number of linearly independent non-trivial
solutions is no higher than the multiplicity of $x - \lambda$ in $\chi_A(x)$.

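In the above two examples, if $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, corresponding to the
eigenvalue $i$, we must solve

\[
\begin{aligned}
-ix_1 + x_2 &= 0 \\
-x_1 - ix_2 &= 0
\end{aligned}
\]

Since reduction of the matrix of coefficients gives

\[ \begin{pmatrix} -i & 1 \\ -1 & -i \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & i \\ 0 & 0 \end{pmatrix} \]

then all solutions are of the form $\alpha(1, i)$, $\alpha \in \mathbb{C}$, and thus $\begin{pmatrix} 1 \\ i \end{pmatrix}$ is an
eigenvector corresponding to the eigenvalue $\lambda = i$, that is

\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ i \end{pmatrix} = i\begin{pmatrix} 1 \\ i \end{pmatrix}. \]

Similarly, if $\lambda = -i$ then $\begin{pmatrix} 1 \\ -i \end{pmatrix}$ is an eigenvector corresponding to
the eigenvalue $\lambda = -i$.

In the second example, when $A = \begin{pmatrix} 2 & 1 \\ -1 & 0 \end{pmatrix}$, then $\chi_A(x) = (x - 1)^2$
and there is only one linearly independent eigenvector corresponding
to $\lambda = 1$.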

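EXAMPLE As a further example, if

\[ A = \begin{pmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ 4 & 2 & 3 \end{pmatrix} \]

then the characteristic polynomial is given by

\[
\begin{aligned}
\chi_A(x) &= \begin{vmatrix} 3 - x & 2 & 4 \\ 2 & -x & 2 \\ 4 & 2 & 3 - x \end{vmatrix}
 = \begin{vmatrix} -1 - x & 2 & 4 \\ 0 & -x & 2 \\ 1 + x & 2 & 3 - x \end{vmatrix} \\
&= (1 + x)\begin{vmatrix} -1 & 2 & 4 \\ 0 & -x & 2 \\ 1 & 2 & 3 - x \end{vmatrix}
 = (x + 1)\begin{vmatrix} -1 & 2 & 4 \\ 0 & -x & 2 \\ 0 & 4 & 7 - x \end{vmatrix} \\
&= -(x + 1)^2 (x - 8).
\end{aligned}
\]

The eigenvalues over $\mathbb{R}$ are $-1$ and $8$.

Corresponding to the eigenvalue $\lambda = -1$, we have the system of linear
equations

\[
\begin{aligned}
4x_1 + 2x_2 + 4x_3 &= 0 \\
2x_1 + x_2 + 2x_3 &= 0 \\
4x_1 + 2x_2 + 4x_3 &= 0
\end{aligned}
\]

which clearly reduce to the one equation

\[ 2x_1 + x_2 + 2x_3 = 0, \quad \text{i.e. } x_2 = -2x_1 - 2x_3 \]

and $(1, -2, 0)$ and $(0, -2, 1)$ are two linearly independent solutions.
Thus, corresponding to the eigenvalue $\lambda = -1$, there are two linearly
independent eigenvectors.

If $\lambda = 8$, then

\[ \begin{pmatrix} 3 - 8 & 2 & 4 \\ 2 & -8 & 2 \\ 4 & 2 & 3 - 8 \end{pmatrix}\begin{pmatrix} 2 \\ 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]

and $\begin{pmatrix} 2 \\ 1 \\ 2 \end{pmatrix}$ is an
eigenvector corresponding to the eigenvalue $\lambda = 8$.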
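
The three eigenvectors found in this example can be confirmed by checking $AX = \lambda X$ directly. This is a small sketch; the helper names `apply` and `is_eigenvector` are illustrative.

```python
A = [[3, 2, 4],
     [2, 0, 2],
     [4, 2, 3]]


def apply(matrix, vector):
    """Multiply a matrix by a column vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def is_eigenvector(matrix, vector, eigenvalue):
    """True if vector is non-zero and matrix * vector = eigenvalue * vector."""
    return any(vector) and apply(matrix, vector) == [eigenvalue * x for x in vector]


checks = [
    is_eigenvector(A, [1, -2, 0], -1),
    is_eigenvector(A, [0, -2, 1], -1),
    is_eigenvector(A, [2, 1, 2], 8),
]
```

Each entry of `checks` corresponds to one of the eigenvectors $(1, -2, 0)$, $(0, -2, 1)$ and $(2, 1, 2)$.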

In order that the above ideas may be extended to cover linear
transformations $T$, we prove the following lemma.

LEMMA 6.5 Similar matrices have the same characteristic polynomial
and hence the same eigenvalues.

PROOF If $A$ and $B \in M_n(K)$ are similar matrices, there exists an
invertible matrix $P$ such that $B = PAP^{-1}$. Now the characteristic
polynomial of $B$ is

\[
\begin{aligned}
\chi_B(x) &= \det(B - xI_n) = \det(PAP^{-1} - xI_n) \\
&= \det(P(A - xI_n)P^{-1}) \\
&= (\det P)\,\det(A - xI_n)\,\det(P^{-1}) \\
&= \det(A - xI_n) \\
&= \chi_A(x)
\end{aligned}
\]

where we have used the fact that the determinant function is
multiplicative (Corollary 2 to Theorem 2.7) and that
$\det(P^{-1}) = (\det P)^{-1}$. ■

Thus, if $T \in \mathcal{L}(V)$, since the matrices of $T$ relative to distinct
$K$-bases for $V$ are similar to each other (see Theorem 4.6), the following
definition of the characteristic polynomial of $T$ is unambiguous.

DEFINITION 6.6 If $T \in \mathcal{L}(V)$, the characteristic polynomial of $T$ is
the characteristic polynomial of the matrix of $T$ relative to any $K$-basis
for $V$.

We can now prove

LEMMA 6.7 Let $\lambda$ be an eigenvalue of $A \in M_n(K)$ ($T \in \mathcal{L}(V)$) and
let $V(\lambda)$ denote the set of eigenvectors corresponding to $\lambda$ together
with the zero vector. Then $V(\lambda)$ is a subspace of $V_n(K)$ ($V$).

PROOF It is clear that $V(\lambda)$ is the solution space of the system of
linear equations represented by

\[ (A - \lambda I_n)X = 0 \]

and by §3.3, Example 4, it is a subspace of $V_n(K)$.

(Similarly $V(\lambda) = \ker(T - \lambda I_V)$, which by Lemma 4.7 is a subspace of $V$.) ■

DEFINITION 6.8 $V(\lambda)$ is called the eigenspace corresponding to the
eigenvalue $\lambda$.

The next theorem gives an upper limit on the dimension of $V(\lambda)$,
that is, on the number of linearly independent eigenvectors corresponding
to the eigenvalue $\lambda$.

THEOREM 6.9 If $\lambda$ is an eigenvalue of $A \in M_n(K)$ ($T \in \mathcal{L}(V)$), then
$(V(\lambda) : K) \le$ multiplicity of $(x - \lambda)$ as a factor in the characteristic
polynomial $\chi_A(x)$ ($\chi_T(x)$).
PROOF We shall give the proof in terms of linear transformations.

Suppose that $(V(\lambda) : K) = r$ and let $\{v_1, \ldots, v_r\}$ be a $K$-basis for $V(\lambda)$,
which is extended to give a $K$-basis $\mathcal{B} = \{v_1, \ldots, v_r, \ldots, v_n\}$ for $V$.
Then we have $Tv_i = \lambda v_i$ $(i = 1, \ldots, r)$ and

\[ (T)_{\mathcal{B}} = \begin{pmatrix} \lambda I_r & * \\ 0 & A \end{pmatrix} \]

where we are not interested in the explicit values in the part indicated $*$
and $A$ is an $(n - r) \times (n - r)$ matrix. Thus

\[
\begin{aligned}
\chi_T(x) &= \det((T)_{\mathcal{B}} - xI_n) \\
&= \det\begin{pmatrix} (\lambda - x)I_r & * \\ 0 & A - xI_{n-r} \end{pmatrix} \\
&= (\lambda - x)^r \det(A - xI_{n-r})
\end{aligned}
\]

Hence $(\lambda - x)^r$ is a factor of the characteristic polynomial of $T$ and
thus $r \le$ multiplicity of $(x - \lambda)$ as a factor in the characteristic
polynomial $\chi_T(x)$ of $T$. ■


Exercises 6.2 


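1. Find the eigenvalues of the following matrices over (a) the rational
field $\mathbb{Q}$, (b) the real field $\mathbb{R}$, (c) the complex field $\mathbb{C}$.

(i) $\begin{pmatrix} 1 & \sqrt{2} \\ -\sqrt{2} & 1 \end{pmatrix}$  (ii) $\begin{pmatrix} 1 & -1 & -1 \\ 1 & -1 & 0 \\ 1 & 0 & -1 \end{pmatrix}$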


(iii) /l 0 0 0 

0 1 1 0 

0 1-1 0 
0 0 0 1 

\0 0 0 -1 
2. Find the characteristic polynomial, eigenvalues and eigenvectors of
the following matrices over the complex field

(i) $\begin{pmatrix} 1 & 0 & -1 \\ 1 & 2 & 1 \\ 2 & 2 & 3 \end{pmatrix}$  (ii) $\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & -3 & 3 \end{pmatrix}$





on) 


2 

i 

1 +2i\ 

(iv) 

( l 

1 

1 

°\ 





—i 

0 

— 

i 



1 

1 

0 

1 




\l 

— 2i 

i 

0 

1 



1 

0 

1 

1 











lo 

1 

1 

1 / 



(V) 

3 

2 

2 



(vi) 

/' ■ 

1 

1 


1 

°\ 


2 

3 

2 

-1 



^ 0 

0 

2 


2 

2 


1 

1 

2 

-1 



0 

1 

0 


-1 

-1 


u 

2 

2 

- 1 / 




l 

0 

-2 


-3 

-2 1 








\ • 

l 

0 

2 


4 

3/ 


3. If $A, B \in M_n(K)$, prove that $AB$ and $BA$ have the same eigenvalues.

4. If $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $A \in M_n(K)$, prove that

(i) if $A$ is invertible, $1/\lambda_1, \ldots, 1/\lambda_n$ are the eigenvalues of $A^{-1}$,

(ii) $\lambda_1^k, \ldots, \lambda_n^k$ are the eigenvalues of $A^k$ $(k = 1, 2, \ldots)$.

What are the corresponding eigenvectors?

5. Find the eigenvalues and eigenvectors of the differentiation
transformation $D$ on $P_n(\mathbb{R})$.

6. If $A \in M_n(K)$, prove that $A$ and $A^t$ have the same eigenvalues.

7. Find the eigenvalues and eigenvectors of reflections and rotations
in $V_2(\mathbb{R})$ over the real field.


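8. Find the characteristic polynomial of the following $n \times n$ matrices

(i) $\begin{pmatrix} 0 & 0 & \cdots & 0 & a_1 \\ 1 & 0 & \cdots & 0 & a_2 \\ 0 & 1 & \cdots & 0 & a_3 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & a_n \end{pmatrix}$

(ii) $\begin{pmatrix} 1 + b & a & a^2 & \cdots & a^{n-1} \\ 1 & a + b & a^2 & \cdots & a^{n-1} \\ 1 & a & a^2 + b & \cdots & a^{n-1} \\ \vdots & & & \ddots & \vdots \\ 1 & a & a^2 & \cdots & a^{n-1} + b \end{pmatrix}$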
6.3 Diagonalization of Matrices

DEFINITION 6.10 (i) A matrix $A \in M_n(K)$ is diagonalizable if there
exists an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.

(ii) A linear transformation $T \in \mathcal{L}(V)$ is diagonalizable if there
exists a $K$-basis for $V$ such that the matrix of $T$ relative to this $K$-basis
is a diagonal matrix.

It is clear from what has been said earlier that these two definitions 
are equivalent. 

We now prove a necessary and sufficient condition for a matrix to 
be diagonalizable. 

THEOREM 6.11 A matrix $A \in M_n(K)$ ($T \in \mathcal{L}(V)$) is diagonalizable
if and only if a set of eigenvectors of $A$ ($T$) forms a $K$-basis for $V_n(K)$ ($V$).

PROOF If $A$ is diagonalizable, then by Definition 6.10 there exists
an invertible matrix $P$ such that $P^{-1}AP = D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$.
Since similar matrices have the same eigenvalues, the eigenvalues of $A$
are $\lambda_1, \ldots, \lambda_n$. Let $P = (C_1, C_2, \ldots, C_n)$, where $C_i$ $(i = 1, \ldots, n)$
indicate the columns of $P$. Since $P$ is invertible, $\{C_1, \ldots, C_n\}$ is linearly
independent over $K$ and thus forms a $K$-basis for $V_n(K)$. Now

\[ AP = PD \]

or $A(C_1, C_2, \ldots, C_n) = (C_1, C_2, \ldots, C_n)\,\mathrm{diag}(\lambda_1, \ldots, \lambda_n)$
implies that

\[ AC_i = \lambda_i C_i \quad (i = 1, \ldots, n) \]

that is, $C_i$ is an eigenvector corresponding to the eigenvalue $\lambda_i$ of $A$.

Conversely, let $\{X_1, X_2, \ldots, X_n\}$ be a linearly independent set of eigenvectors
of $A$, and suppose that these eigenvectors correspond to the eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$ (not necessarily distinct). If $P = (X_1, X_2, \ldots, X_n)$ is the $n \times n$ matrix
formed with $X_i$ $(i = 1, \ldots, n)$ as its columns, then by the Corollary to
Theorem 4.18, $P$ is invertible, and by reversing the above argument we
have that

\[ AX_i = \lambda_i X_i \quad (i = 1, \ldots, n) \]

implies that

\[ P^{-1}AP = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n). \]

■

The above theorem implies that if $A$ is diagonalizable then $\chi_A(x)$
factors completely into linear factors.
We can now prove that

THEOREM 6.12 If $A \in M_n(K)$ ($T \in \mathcal{L}(V)$) has $n$ distinct eigenvalues
then $A$ ($T$) is diagonalizable.

PROOF Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the $n$ distinct eigenvalues of $A$ and
$X_1, X_2, \ldots, X_n$ be the corresponding eigenvectors, thus
$AX_i = \lambda_i X_i$ $(i = 1, \ldots, n)$. We need only prove that $\{X_1, X_2, \ldots, X_n\}$
is linearly independent over $K$; then the proof will be complete by
Theorem 6.11. The proof of this is by induction on $n$. This is clearly
true when $n = 1$. We shall assume that $\{X_1, X_2, \ldots, X_{r-1}\}$ is linearly
independent over $K$, where $1 \le r - 1 < n$. Consider

\[ \alpha_1 X_1 + \alpha_2 X_2 + \ldots + \alpha_r X_r = 0 \]

where $\alpha_i \in K$ $(i = 1, \ldots, r)$.

Premultiplying by $A$ gives

\[ \alpha_1 \lambda_1 X_1 + \alpha_2 \lambda_2 X_2 + \ldots + \alpha_r \lambda_r X_r = 0 \]

and subtracting this from $\lambda_r$ times the previous equation gives

\[ \alpha_1(\lambda_r - \lambda_1)X_1 + \ldots + \alpha_{r-1}(\lambda_r - \lambda_{r-1})X_{r-1} = 0. \]

But $\{X_1, \ldots, X_{r-1}\}$ is linearly independent over $K$ and so

\[ \alpha_i(\lambda_r - \lambda_i) = 0 \quad (i = 1, \ldots, r - 1). \]

Furthermore $(\lambda_r - \lambda_i) \ne 0$ $(i = 1, \ldots, r - 1)$ since the eigenvalues are
distinct and so $\alpha_1 = \alpha_2 = \ldots = \alpha_{r-1} = 0$ and $\alpha_r X_r = 0$, and since
$X_r \ne 0$, $\alpha_r = 0$, i.e. $\{X_1, \ldots, X_r\}$ is linearly independent over $K$
$(1 < r \le n)$. ■

From the proof of Theorems 6.9 and 6.11 we have the following

COROLLARY If $A \in M_n(K)$ is diagonalizable then
$(V(\lambda) : K) =$ multiplicity of $(x - \lambda)$ as a factor in $\chi_A(x)$ for each
eigenvalue $\lambda$ of $A$.

We note that the converse is also true when the characteristic
polynomial factors completely into linear factors, e.g. when $K = \mathbb{C}$.

We refer back to the earlier examples (pp. 151-152).

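EXAMPLES

1. If $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, put $P = \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}$; then

\[ P^{-1}AP = \mathrm{diag}(i, -i). \]

2. If $A = \begin{pmatrix} 2 & 1 \\ -1 & 0 \end{pmatrix}$, then since $(1, -1)$ is the only linearly independent
eigenvector corresponding to the only eigenvalue $1$, $(V(1) : \mathbb{R}) = 1$ and by
Theorem 6.11, $A$ is not diagonalizable.

3. If $A = \begin{pmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ 4 & 2 & 3 \end{pmatrix}$, the eigenvalues are $-1$, $8$.

We have shown that $\{(1, -2, 0), (0, -2, 1)\}$ is an $\mathbb{R}$-basis for $V(-1)$ and
$\{(2, 1, 2)\}$ is an $\mathbb{R}$-basis for $V(8)$. If we put

\[ P = \begin{pmatrix} 1 & 0 & 2 \\ -2 & -2 & 1 \\ 0 & 1 & 2 \end{pmatrix} \]

then

\[ P^{-1}AP = \mathrm{diag}(-1, -1, 8). \]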
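
The diagonalization in Example 3 can be checked without computing $P^{-1}$: it is enough to verify that $AP = PD$ and that $\det P \ne 0$. The sketch below assumes that the columns of $P$ are the eigenvectors $(1, -2, 0)$, $(0, -2, 1)$ and $(2, 1, 2)$ found earlier; the helper names are illustrative.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A = [[3, 2, 4], [2, 0, 2], [4, 2, 3]]
P = [[1, 0, 2], [-2, -2, 1], [0, 1, 2]]     # columns: the three eigenvectors
D = [[-1, 0, 0], [0, -1, 0], [0, 0, 8]]

AP = mat_mul(A, P)
PD = mat_mul(P, D)
det_P = det3(P)     # non-zero, so P is invertible and P^{-1} A P = D
```

Since `AP` and `PD` agree and `det_P` is non-zero, premultiplying $AP = PD$ by $P^{-1}$ gives the stated diagonal form.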


We now assess the progress which has been made on the problem 
which was the main motivation for the work in this chapter, namely 
that of finding matrices of a “simple” form to be representatives of the 
equivalence classes under the equivalence relation of similarity. We have 
seen that in certain circumstances the “simple” form chosen is a 
diagonal matrix, but unfortunately, as we have seen in Example 2 above 
this simple form cannot always be chosen to be a diagonal matrix. We 
now state the solution in the general case without proof, the proof 
being beyond the scope of this book. 

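We assume that $K = \mathbb{C}$, the complex field.

Let $A \in M_n(K)$ have characteristic polynomial

\[ \chi_A(x) = (-1)^n (x - \lambda_1)^{n_1} (x - \lambda_2)^{n_2} \cdots (x - \lambda_s)^{n_s} \]

where $\lambda_1, \lambda_2, \ldots, \lambda_s$ are the $s$ distinct eigenvalues of $A$ and
$n_1 + n_2 + \cdots + n_s = n$.

Let

\[ J(\lambda, r) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{pmatrix} \in M_r(K). \]

If now $m_{i1}, m_{i2}, \ldots, m_{is_i}$ are positive integers such that
$m_{i1} \ge m_{i2} \ge \ldots \ge m_{is_i}$ and

\[ m_{i1} + m_{i2} + \ldots + m_{is_i} = n_i \]

let

\[ J(n_i) = \begin{pmatrix} J(\lambda_i, m_{i1}) & & & 0 \\ & J(\lambda_i, m_{i2}) & & \\ & & \ddots & \\ 0 & & & J(\lambda_i, m_{is_i}) \end{pmatrix} \]

then the matrix $A$ is similar to the matrix

\[ J = \begin{pmatrix} J(n_1) & & & 0 \\ & J(n_2) & & \\ & & \ddots & \\ 0 & & & J(n_s) \end{pmatrix} \]

for some choice of $m_{ij}$ $(i = 1, \ldots, s;\ j = 1, 2, \ldots, s_i)$ satisfying the
above condition. Such a $J$ is called the Jordan Canonical Form of $A$.

It will be noted that $J$ has the eigenvalues of $A$ as diagonal elements
and a distribution of zeros and ones on the super diagonal and zeros
elsewhere in the matrix.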
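
To make the shape of these matrices concrete, the following sketch builds a Jordan block $J(\lambda, r)$ and places several blocks down the diagonal. It only constructs the matrices; it does not compute the Jordan form of a given $A$. The function names `jordan_block` and `block_diagonal` are illustrative.

```python
def jordan_block(eigenvalue, size):
    """The size x size matrix J(eigenvalue, size): eigenvalue on the
    diagonal, ones on the super diagonal, zeros elsewhere."""
    return [[eigenvalue if i == j else (1 if j == i + 1 else 0)
             for j in range(size)] for i in range(size)]


def block_diagonal(blocks):
    """Place square blocks down the diagonal, with zeros elsewhere."""
    n = sum(len(b) for b in blocks)
    result = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, entry in enumerate(row):
                result[offset + i][offset + j] = entry
        offset += len(b)
    return result


# A 3 x 3 Jordan form with one block of size 2 for the eigenvalue 2
# and one block of size 1 for the eigenvalue 3.
J = block_diagonal([jordan_block(2, 2), jordan_block(3, 1)])
```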

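This is best illustrated by an example. In the case $n = 3$, every $3 \times 3$
matrix $A$ with complex elements will be similar to one of the following
(where the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ need not be distinct):

\[
\begin{pmatrix} \lambda_1 & 1 & 0 \\ 0 & \lambda_1 & 1 \\ 0 & 0 & \lambda_1 \end{pmatrix}, \qquad
\begin{pmatrix} \lambda_1 & 1 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{pmatrix}, \qquad
\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}
\]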


Exercises 6.3 

1. When possible, find an invertible matrix $P$ such that $P^{-1}AP$ is a
diagonal matrix if $A$ is
(vi) / 3
3 


\ 


1 

■1 

1 

0 


-2 2 \ 
-2 2 


2 -1 -1 


0 1 


(vii) / 1 
0 


(i) 

1 -1 -1\ 00 

1 ° 

-L^ 

(iii) 

(-2 


1-1 0 

1 2 

1 


1 


\1 0 -11 

\2 2 

3/ 


l 0 

(iv) 

2-i 0 i \ 

(v) /-I 

-1 -6 

3 \ 


0 1 +i 0 


1 

-2 -3 

0 


^ i 0 2-i/ 


-1 

1 0 

1 


1 -1 -5 


\ 


0 

0 


0 

0 


1 

1 

-1 

0 


1\ 


1 

-1 

-1 


-8 

4 

0 


—12\ 
4 
1 / 


2. Find the eigenvalues and eigenvectors of each of the following 
matrices 



over $\mathbb{Q}$. Show that $A$ is similar to a diagonal matrix and explain why $B$
is not. Find an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.


3. Determine the eigenvalues of the matrix

\[ A = \begin{pmatrix} 1 & a & a^2 & 0 \\ 1 & a & 0 & a^2 \\ 1 & 0 & a & a^2 \\ 0 & 1 & a & a^2 \end{pmatrix} \]

Find an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.

4. If $A = \begin{pmatrix} 1 & 1 & 2 \\ 0 & 2 & 1 \\ 0 & 0 & 3 \end{pmatrix}$ and $P$ is an invertible matrix, by considering
$(P^{-1}AP)^n$ or otherwise, find $A^n$, where $n$ is a positive integer.
5. Find an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix if

\[ A = \begin{pmatrix} -1 & 4 & -2 \\ 1 & -1 & 1 \\ 3 & -6 & 4 \end{pmatrix} \]

Hence, find a matrix $X$ $(\ne \pm A)$ satisfying $X^2 = A$.

6.4 The Minimum Polynomial of a Matrix and the Cayley-Hamilton 
Theorem 

Throughout this section we assume that $K$ is the complex field.

In §3.4, we saw that $(M_n(K) : K) = n^2$. Thus, if $A \in M_n(K)$, the set
$\{I, A, A^2, \ldots, A^{n^2}\}$ is linearly dependent over $K$, i.e. there exist
$\alpha_0, \alpha_1, \ldots, \alpha_{n^2} \in K$, not all zero, such that

\[ \alpha_0 I + \alpha_1 A + \cdots + \alpha_{n^2} A^{n^2} = 0. \]

If we put

\[ f(x) = \alpha_0 + \alpha_1 x + \ldots + \alpha_{n^2} x^{n^2} \]

then

\[ f(A) = 0. \]

If now $m$ is a positive integer such that $\{I, A, A^2, \ldots, A^{m-1}\}$ is linearly
independent over $K$ and $\{I, A, A^2, \ldots, A^m\}$ is linearly dependent
over $K$, then there exist $\beta_0, \beta_1, \ldots, \beta_m$ with $\beta_m \ne 0$ such that

\[ \beta_0 I + \beta_1 A + \ldots + \beta_m A^m = 0 \]

or

\[ \gamma_0 I + \gamma_1 A + \ldots + \gamma_{m-1} A^{m-1} + A^m = 0 \]

where $\gamma_i = \beta_i / \beta_m$ $(i = 0, 1, \ldots, m - 1)$. Hence, there exists a monic
polynomial $f(x)$ of degree $m \le n^2$ such that $f(A) = 0$. We can now give
the following definition:

DEFINITION 6.13 A minimum polynomial of a matrix $A \in M_n(K)$
is a monic polynomial $m(x)$ of least degree such that $m(A) = 0$.

The above argument shows that such a polynomial exists and is of
degree at most $n^2$. Our aim will be to obtain further restrictions on the
degree and form of the minimum polynomial which will prove useful in its
evaluation.
The minimum polynomial of a linear transformation $T \in \mathcal{L}(V)$ can
be defined in a similar way by using precisely the same arguments in
the vector space $\mathcal{L}(V)$.

We now prove a series of useful lemmas concerning the minimum
polynomial.

LEMMA 6.14 The minimum polynomial of $A \in M_n(K)$ is unique.

PROOF Let $m(x)$ and $m'(x)$ be minimum polynomials of the
matrix $A$. They are clearly of the same degree. Let $f(x) = m(x) - m'(x)$;
then $f(x)$ is of lower degree than $m(x)$ and $m'(x)$ (since both are monic)
and $f(A) = m(A) - m'(A) = 0$. If $f(x) \ne 0$, dividing $f(x)$ by its leading
coefficient leads to a monic polynomial of
lower degree than $m(x)$ satisfying the requirements for a minimum
polynomial, which contradicts the definition of $m(x)$. Thus
$m(x) - m'(x) = 0$, or $m(x) = m'(x)$. ■

LEMMA 6.15 The minimum polynomial $m(x)$ of $A \in M_n(K)$ divides
every polynomial $f(x)$ such that $f(A) = 0$.

PROOF By the division algorithm for polynomials, there exist
polynomials $q(x)$, $r(x)$ such that

\[ f(x) = q(x)\,m(x) + r(x) \]

where $r(x) = 0$ or the degree of $r(x)$ is less than the degree of $m(x)$. Now

\[ r(A) = f(A) - q(A)\,m(A) = 0 \]

which means that if $r(x) \ne 0$, we have (after dividing by its leading
coefficient) a monic polynomial of lower
degree than $m(x)$ satisfying the requirements for a minimum polynomial.
Thus $r(x) = 0$ and $m(x)$ divides $f(x)$. ■

LEMMA 6.16 Similar matrices have the same minimum polynomial.

PROOF If $A, B \in M_n(K)$ are similar, then there exists an invertible
matrix $P$ such that

\[ P^{-1}AP = B. \]

It is easily shown by induction that for $k \ge 1$

\[ P^{-1}A^kP = B^k \]

and if $f(x) = \alpha_0 + \alpha_1 x + \ldots + \alpha_m x^m$ then

\[
\begin{aligned}
f(B) &= \alpha_0 I + \alpha_1 B + \ldots + \alpha_m B^m \\
&= \alpha_0 P^{-1}P + \alpha_1 P^{-1}AP + \ldots + \alpha_m P^{-1}A^mP \\
&= P^{-1}f(A)P.
\end{aligned}
\]

Thus if $f(A) = 0$ then $f(B) = 0$ and conversely.

The result now follows from Lemma 6.15. ■

We now prove an important theorem which gives further restrictions 
on the degree of the minimum polynomial, indeed that the degree of 
772 (x)< 72, if A EM n (K). 

THEOREM 6.17 (Cayley-Hamilton Theorem) If $\chi_A(x)$ is the
characteristic polynomial of $A$, then $\chi_A(A) = 0$ or alternatively,
"every matrix satisfies its characteristic polynomial".

PROOF If $A \in M_n(K)$, let the characteristic polynomial of $A$ be

$$\chi_A(x) = (-1)^n(x^n + a_1x^{n-1} + \dots + a_n)$$

By §2.4, we have

$$(A - xI)\,\mathrm{adj}(A - xI) = |A - xI| \cdot I = \chi_A(x) \cdot I$$

Now, by considering the form of $\mathrm{adj}(A - xI)$, we have
$\mathrm{adj}(A - xI) = (p_{ij}(x))$, where the $p_{ij}(x)$ are polynomials of degree at most
$n - 1$ in $x$. Thus $\mathrm{adj}(A - xI)$ may be expressed in the form

$$\mathrm{adj}(A - xI) = B_0 + B_1x + \dots + B_{n-1}x^{n-1}$$

where $B_0, B_1, \dots, B_{n-1} \in M_n(K)$. By comparing the coefficients of
powers of $x$ in

$$(A - xI)(B_0 + B_1x + \dots + B_{n-1}x^{n-1}) = (-1)^n(x^n + a_1x^{n-1} + \dots + a_n)\,I$$

we obtain

$$\begin{aligned}
AB_0 &= (-1)^n a_n I \\
-B_0 + AB_1 &= (-1)^n a_{n-1} I \\
&\;\;\vdots \\
-B_{n-2} + AB_{n-1} &= (-1)^n a_1 I \\
-B_{n-1} &= (-1)^n I
\end{aligned}$$

Premultiplying these equations by $I, A, \dots, A^n$ respectively and adding,
the left-hand sides cancel in pairs and we obtain

$$0 = (-1)^n(a_nI + a_{n-1}A + \dots + A^n) = \chi_A(A) \quad ■$$
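The theorem can be checked numerically on a small matrix. The following is a minimal sketch, not part of the text's argument: it computes the coefficients of $\det(xI - A)$, which equals $(-1)^n \chi_A(x)$ in the convention used here, by the Faddeev-LeVerrier recurrence (a technique swapped in for convenience) and then evaluates the polynomial at $A$. The function names `mat_mul`, `char_poly_monic` and `eval_poly_at_matrix` are hypothetical helpers, and the sample matrix is an arbitrary illustration.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_monic(a):
    """Coefficients [c_0, ..., c_{n-1}, 1] of det(xI - A), lowest degree first."""
    n = len(a)
    a = [[Fraction(v) for v in row] for row in a]
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    m = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M_1 = I
    for k in range(1, n + 1):
        am = mat_mul(a, m)
        c = -sum(am[i][i] for i in range(n)) / k  # c_{n-k} = -tr(A M_k) / k
        coeffs[n - k] = c
        # M_{k+1} = A M_k + c_{n-k} I
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs


def eval_poly_at_matrix(coeffs, a):
    """Evaluate c_0 I + c_1 A + ... + c_n A^n by Horner's rule."""
    n = len(a)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, a)
        result = [[result[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result


A = [[3, 2, 4], [2, 0, 2], [4, 2, 3]]
coeffs = char_poly_monic(A)            # x^3 - 6x^2 - 15x - 8
value_at_A = eval_poly_at_matrix(coeffs, A)
print(coeffs)
print(value_at_A)                      # the zero matrix, as the theorem asserts
```

Exact rational arithmetic is used so that the zero matrix is obtained exactly rather than up to rounding error.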

The above results lead immediately to the following useful 

COROLLARY The minimum polynomial of a matrix divides its
characteristic polynomial.

PROOF This follows from the above theorem and Lemma 6.15. ■

EXAMPLES

1. If $A = \mathrm{diag}(1, 0, -1)$, then the characteristic polynomial of $A$ is
$-x(x - 1)(x + 1)$. By calculating $A^2 = \mathrm{diag}(1, 0, 1)$ and $A^3 = \mathrm{diag}(1, 0, -1)$
it is clear that no linear relation of lower degree than $A^3 - A = 0$ is true.
Thus the minimum polynomial and characteristic polynomial coincide
(up to sign) in this case.

2. If

$$A = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

then the characteristic polynomial of $A$ is $-x^3$ and since $A^2 = 0$, $A \neq 0$,
the minimum polynomial is $x^2$. Thus, the characteristic polynomial and
minimum polynomial need not be the same in general.
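The search for "the linear relation of lowest degree" among the powers of $A$ can be mechanised. The sketch below is one possible approach under stated assumptions, not the method of the text: it treats $I, A, A^2, \dots$ as vectors of length $n^2$ and stops at the first power that is a linear combination of the earlier ones, solving for the combination by Gaussian elimination over the rationals. The names `solve_combination` and `minimal_polynomial` are hypothetical.

```python
from fractions import Fraction


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve_combination(vectors, target):
    """Return c with sum(c[i] * vectors[i]) == target, or None if impossible.

    The vectors are assumed linearly independent (true for the powers of A
    below the degree of the minimum polynomial).
    """
    k = len(vectors)
    rows = [[v[r] for v in vectors] + [target[r]] for r in range(len(target))]
    for col in range(k):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [x / lead for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    # The system is consistent only if every leftover row has zero right-hand side.
    if any(rows[r][k] != 0 for r in range(k, len(rows))):
        return None
    return [rows[i][k] for i in range(k)]


def minimal_polynomial(a):
    """Coefficients of the monic minimum polynomial of A, lowest degree first."""
    n = len(a)
    a = [[Fraction(v) for v in row] for row in a]
    current = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    powers = [[v for row in current for v in row]]      # flattened I
    while True:
        current = mat_mul(current, a)
        target = [v for row in current for v in row]    # flattened A^d
        combo = solve_combination(powers, target)
        if combo is not None:
            # A^d = sum combo[i] A^i, so m(x) = x^d - sum combo[i] x^i
            return [-c for c in combo] + [Fraction(1)]
        powers.append(target)


print(minimal_polynomial([[1, 0, 0], [0, 0, 0], [0, 0, -1]]))  # x^3 - x
print(minimal_polynomial([[0, 0, 1], [0, 0, 0], [0, 0, 0]]))   # x^2
```

The loop terminates after at most $n$ steps, since by the Cayley-Hamilton Theorem the degree of the minimum polynomial is at most $n$.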


THEOREM 6.18 The distinct linear factors of the minimum
polynomial coincide with those of the characteristic polynomial.

PROOF Let the characteristic polynomial of $A$ be

$$\chi_A(x) = (-1)^n (x - \lambda_1)^{m_1}(x - \lambda_2)^{m_2} \dots (x - \lambda_k)^{m_k}$$

where $\lambda_1, \lambda_2, \dots, \lambda_k$ are the distinct eigenvalues of $A$. Then by the
corollary to Theorem 6.17,

$$m(x) = (x - \lambda_1)^{l_1}(x - \lambda_2)^{l_2} \dots (x - \lambda_k)^{l_k}$$

where $0 \leq l_i \leq m_i$ $(i = 1, \dots, k)$. We must show that $l_i > 0$ for all
$i = 1, \dots, k$.

Suppose that for some $1 \leq j \leq k$, $l_j = 0$; then $m(\lambda_j) \neq 0$. But, if
$\lambda$ is an eigenvalue of $A$, then there exists a non-zero $X \in V_n(K)$
such that

$$AX = \lambda X$$

and by induction

$$A^kX = \lambda^kX$$

for $k = 1, 2, \dots$ and furthermore

$$m(A)X = m(\lambda)X$$

Thus, if $m(\lambda) \neq 0$ for some eigenvalue $\lambda$ this would mean that
$m(A) \neq 0$, which would contradict the definition of the minimum
polynomial. Hence, no such eigenvalue $\lambda_j$ can exist and $l_j > 0$ for all
$j = 1, \dots, k$. ■

COROLLARY If a matrix $A \in M_n(K)$ has $n$ distinct eigenvalues, then
its minimum and characteristic polynomials coincide (up to sign).

We can now state an important criterion for a matrix to be
diagonalizable in terms of its minimum polynomial. A proof is not
included.

THEOREM 6.19 A matrix $A \in M_n(K)$ with $k$ distinct eigenvalues
$\lambda_1, \dots, \lambda_k$ is diagonalizable if and only if its minimum polynomial is
$(x - \lambda_1) \dots (x - \lambda_k)$.

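EXAMPLES

1. If

$$A = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}$$

then its characteristic polynomial is clearly $(x - 1)^2(x - 2)^2$. By
Theorem 6.18, the minimum polynomial of $A$ must be one of
(i) $(x - 1)(x - 2)$; (ii) $(x - 1)^2(x - 2)$; (iii) $(x - 1)(x - 2)^2$;
(iv) $(x - 1)^2(x - 2)^2$.

The minimum polynomial is now determined by systematically
eliminating these possibilities.

$$(A - I)(A - 2I) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \neq 0$$

$$(A - I)^2(A - 2I) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = 0$$

Thus the minimum polynomial is $(x - 1)^2(x - 2)$ and the matrix $A$ is
not diagonalizable.

2. If

$$A = \begin{pmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ 4 & 2 & 3 \end{pmatrix}$$

then the characteristic polynomial is $-(x + 1)^2(x - 8)$. We see that

$$(A + I)(A - 8I) = \begin{pmatrix} 4 & 2 & 4 \\ 2 & 1 & 2 \\ 4 & 2 & 4 \end{pmatrix} \begin{pmatrix} -5 & 2 & 4 \\ 2 & -8 & 2 \\ 4 & 2 & -5 \end{pmatrix} = 0$$

and thus the minimum polynomial of $A$ is $(x + 1)(x - 8)$ and so the
matrix $A$ is diagonalizable, as we have seen in the example on page 155.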
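The elimination of candidates in these two examples amounts to multiplying out a few matrix products, which can be sketched as follows. This is only an illustrative check: the helper names `product_of_factors` and `is_zero` are hypothetical, and the matrix of Example 1 is taken to be the 4 x 4 matrix with rows (1, 1, 0, 0), (0, 1, 0, 0), (0, 0, 2, 0), (0, 0, 0, 2).

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def product_of_factors(a, roots):
    """Return (A - r_1 I)(A - r_2 I)... ; repeat a root to raise its power."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for r in roots:
        shifted = [[a[i][j] - (r if i == j else 0) for j in range(n)]
                   for i in range(n)]
        result = mat_mul(result, shifted)
    return result


def is_zero(m):
    return all(v == 0 for row in m for v in row)


A1 = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 2]]
A2 = [[3, 2, 4], [2, 0, 2], [4, 2, 3]]

print(is_zero(product_of_factors(A1, [1, 2])))     # candidate (x-1)(x-2)
print(is_zero(product_of_factors(A1, [1, 1, 2])))  # candidate (x-1)^2 (x-2)
print(is_zero(product_of_factors(A2, [-1, 8])))    # candidate (x+1)(x-8)
```

By Theorem 6.19 the first matrix is not diagonalizable (its minimum polynomial has a repeated factor), while the second is.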

Exercises 6.4 

1. Find the minimum polynomial of the following matrices 


(i) 


1 

-1 



(iii) /I 0 0 

0 1 1 

.0 0 7 , 


(iv) /1 


-4 2 

2 -1 -1 
0-4 3 

2 0-2 


4 \ 

2 

4 

1 


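2. Find the minimum polynomial and characteristic polynomial of the
matrix

$$\begin{pmatrix}
0 & 1 & 0 & \dots & 0 \\
0 & 0 & 1 & \dots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \dots & 1 \\
a_0 & a_1 & a_2 & \dots & a_{n-1}
\end{pmatrix}$$

Given a polynomial $f(x)$ of degree $n$, find an $n \times n$ matrix with $f(x)$ as
its minimum polynomial.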

3. If a matrix ,4 has characteristic polynomial (x — l) 3 (x + 2) 2 (x — 3), 
find all the possible minimum polynomials of A. Find matrices A which 
have these polynomials as their minimum polynomials. 
4. Give a direct proof of the Cayley-Hamilton Theorem for
(i) diagonalizable matrices, (ii) triangular matrices.


5. By calculating the minimum polynomial, determine which of the 
following matrices are diagonalizable: 



(iii) /l 
0 
0 
0 


4 

1 

4 

0 


-2 -4 

0 -2 
-1 -4 

0 -1 



-4 0 

-1 0 
-3 1 

0 0 


5 

1 

4 

1 


\ 

/ 


6.5 The Diagonalization of Symmetric Matrices 

Let $M_n^{(s)}(R)$ denote the set of all real symmetric $n \times n$ matrices. We
shall show that in this case, the problem considered in Section 6.3
can be solved completely.

DEFINITION 6.20 $A, B \in M_n^{(s)}(R)$ are said to be orthogonally
similar if there exists an orthogonal matrix $P \in M_n(R)$ such that

$$B = P^tAP$$

(note that if $P$ is orthogonal, $P^tP = PP^t = I_n$ and so $P^{-1} = P^t$).

We first prove two lemmas. 

LEMMA 6.21 The eigenvalues of a real symmetric matrix are all real.

PROOF Let $A \in M_n^{(s)}(R)$ and $\lambda$ be an eigenvalue of $A$ with
corresponding eigenvector $X \in V_n(C)$; then

$$AX = \lambda X$$

Taking complex conjugates (recall that $A$ is real) and then transposes
(recall that $A^t = A$) we obtain

$$A\bar{X} = \bar{\lambda}\bar{X}$$

and

$$\bar{X}^tA = \bar{\lambda}\bar{X}^t$$

Thus, we have

$$\bar{X}^tAX = \lambda\bar{X}^tX = \bar{\lambda}\bar{X}^tX$$

from which it follows that

$$(\lambda - \bar{\lambda})(\bar{X}^tX) = 0$$

Now, if $X = (x_1, x_2, \dots, x_n)$, then

$$\bar{X}^tX = \bar{x}_1x_1 + \dots + \bar{x}_nx_n = |x_1|^2 + \dots + |x_n|^2 \neq 0$$

and thus $\lambda = \bar{\lambda}$ and $\lambda$ is real. ■


LEMMA 6.22 If the eigenvalues of a real symmetric matrix $A$ are all
distinct then the corresponding eigenvectors are orthogonal to each
other.

PROOF Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the distinct eigenvalues of $A$ with
corresponding eigenvectors $X_1, X_2, \dots, X_n$, i.e.
$AX_i = \lambda_iX_i$ $(i = 1, 2, \dots, n)$. We wish to show that if $i \neq j$,

$$(X_i, X_j) = X_i^tX_j = 0$$

Transposing $AX_i = \lambda_iX_i$ and using $A^t = A$, we have that

$$X_i^tA = \lambda_iX_i^t$$

and

$$X_i^tAX_j = \lambda_iX_i^tX_j = \lambda_jX_i^tX_j$$

or

$$(\lambda_i - \lambda_j)X_i^tX_j = 0$$

from which it follows that $X_i^tX_j = 0$ since $\lambda_i - \lambda_j \neq 0$. ■

From these two results it now follows that 


THEOREM 6.23 If $A$ is a real symmetric matrix with distinct
eigenvalues then $A$ is orthogonally similar to a diagonal matrix.

PROOF Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the eigenvalues of $A$ and $X_1, X_2, \dots, X_n$
the corresponding eigenvectors. Let

$$Y_i = \frac{1}{|X_i|}X_i \quad (i = 1, 2, \dots, n)$$

and put $P = (Y_1, Y_2, \dots, Y_n)$; then $P$ is an orthogonal matrix, since by
Lemma 6.22 above $(Y_i, Y_j) = 0$ $(i \neq j)$ and $(Y_i, Y_i) = 1$ $(i, j = 1, 2, \dots, n)$.
Furthermore, $Y_1, Y_2, \dots, Y_n$ are also eigenvectors corresponding to the
eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ and by Theorem 6.11

$$P^tAP = P^{-1}AP = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n) \quad ■$$

In the next theorem, we see that this result is even true when the
eigenvalues are not distinct.

THEOREM 6.24 Let $A$ be a real symmetric $n \times n$ matrix. Then
there exists an orthogonal matrix $P$ such that $P^tAP = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$
where $\lambda_1, \lambda_2, \dots, \lambda_n$ are the eigenvalues of $A$ (possibly repeated).


PROOF The proof is by induction on $n$.

If $n = 1$, the theorem is trivially true.

If $n > 1$, we assume that every $(n - 1) \times (n - 1)$ real symmetric matrix
is orthogonally similar to a diagonal matrix.

Let $\lambda$ be an eigenvalue of $A$ (real, by Lemma 6.21) and $X$ a corresponding
eigenvector, thus $AX = \lambda X$. Put $X_1 = \frac{1}{|X|}X$; then by the Gram-Schmidt
orthogonalization procedure (see Theorem 5.6) an orthonormal R-basis
$\{X_1, X_2, \dots, X_n\}$ for $V_n(R)$ may be constructed.

Put $U = (X_1, X_2, \dots, X_n)$; then $U$ is an orthogonal $n \times n$ matrix and

$$U^tAU = \begin{pmatrix} \lambda & * \\ 0 & B \end{pmatrix}$$

where $B$ is an $(n - 1) \times (n - 1)$ matrix. Since $A$ is a symmetric matrix
and

$$(U^tAU)^t = U^tAU$$

then $U^tAU$ is a symmetric matrix, so $B$ is also a symmetric matrix and

$$U^tAU = \begin{pmatrix} \lambda & 0 \\ 0 & B \end{pmatrix}$$

Now, by the induction assumption, there exists an $(n - 1) \times (n - 1)$
orthogonal matrix $V$ such that

$$V^tBV = \mathrm{diag}(\lambda_2, \dots, \lambda_n)$$

where $\lambda_2, \dots, \lambda_n$ are eigenvalues of $B$, and hence also eigenvalues of $A$.
Let

$$V' = \begin{pmatrix} 1 & 0 \\ 0 & V \end{pmatrix}$$

then $V'$ is an $n \times n$ orthogonal matrix and

$$V'^tU^tAUV' = V'^t\begin{pmatrix} \lambda & 0 \\ 0 & B \end{pmatrix}V' = \mathrm{diag}(\lambda, \lambda_2, \dots, \lambda_n)$$

and $UV'$ is an orthogonal matrix, which completes the proof. ■
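EXAMPLES

1. If

$$A = \begin{pmatrix} 10 & -14 & -10 \\ -14 & 7 & -4 \\ -10 & -4 & 19 \end{pmatrix}$$

then the characteristic polynomial

$$\chi_A(x) = \begin{vmatrix} 10 - x & -14 & -10 \\ -14 & 7 - x & -4 \\ -10 & -4 & 19 - x \end{vmatrix}$$

which on evaluation gives

$$\chi_A(x) = -(x - 18)(x - 27)(x + 9)$$

and the eigenvalues are $\lambda = 18$, $27$ and $-9$.

If $\lambda = 18$ then

$$\begin{pmatrix} -8 & -14 & -10 \\ -14 & -11 & -4 \\ -10 & -4 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0$$

and $(1, -2, 2)$ is an eigenvector; if $\lambda = -9$, then

$$\begin{pmatrix} 19 & -14 & -10 \\ -14 & 16 & -4 \\ -10 & -4 & 28 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0$$

and $(2, 2, 1)$ is an eigenvector; and if $\lambda = 27$, then

$$\begin{pmatrix} -17 & -14 & -10 \\ -14 & -20 & -4 \\ -10 & -4 & -8 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0$$

and $(-2, 1, 2)$ is an eigenvector.

If we now let

$$P = \frac{1}{3}\begin{pmatrix} 1 & 2 & -2 \\ -2 & 2 & 1 \\ 2 & 1 & 2 \end{pmatrix}$$

then $P$ is an orthogonal matrix and

$$P^tAP = \mathrm{diag}(18, -9, 27)$$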
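The claim that $P$ is orthogonal and diagonalizes $A$ can be confirmed with integer arithmetic alone by working with $Q = 3P$, for which $Q^tQ = 9I$ and $Q^tAQ = 9\,\mathrm{diag}(18, -9, 27)$. The following short sketch does this; the names `Q`, `QtQ` and `QtAQ` are introduced purely for illustration.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(m):
    return [list(col) for col in zip(*m)]


A = [[10, -14, -10], [-14, 7, -4], [-10, -4, 19]]
Q = [[1, 2, -2], [-2, 2, 1], [2, 1, 2]]   # Q = 3P, columns are eigenvectors

QtQ = mat_mul(transpose(Q), Q)            # 9 I if the columns are orthogonal
QtAQ = mat_mul(mat_mul(transpose(Q), A), Q)

print(QtQ)
print(QtAQ)   # 9 * diag(18, -9, 27)
```

Dividing both results by 9 recovers $P^tP = I$ and $P^tAP = \mathrm{diag}(18, -9, 27)$.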


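2. If

$$A = \begin{pmatrix} 7 & -1 & -2 \\ -1 & 7 & 2 \\ -2 & 2 & 10 \end{pmatrix}$$

then the characteristic polynomial of $A$ is

$$\chi_A(x) = \begin{vmatrix} 7 - x & -1 & -2 \\ -1 & 7 - x & 2 \\ -2 & 2 & 10 - x \end{vmatrix} = -(x - 6)^2(x - 12);$$

the eigenvalues of $A$ are $\lambda = 6$ and $\lambda = 12$. When $\lambda = 12$, then

$$\begin{pmatrix} -5 & -1 & -2 \\ -1 & -5 & 2 \\ -2 & 2 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0$$

and $(-1, 1, 2)$ is an eigenvector. When $\lambda = 6$, then

$$\begin{pmatrix} 1 & -1 & -2 \\ -1 & 1 & 2 \\ -2 & 2 & 4 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0$$

and $(1, 1, 0)$ and $(0, 2, -1)$ are linearly independent eigenvectors.

The two vectors $(1, 1, 0)$ and $(0, 2, -1)$ are not orthogonal to each
other, but by applying the Gram-Schmidt orthogonalization procedure
we obtain

$$y_1 = (1, 1, 0)$$

$$y_2 = (0, 2, -1) - (1, 1, 0) = (-1, 1, -1)$$

Then $(1, 1, 0)$ and $(1, -1, 1)$ are orthogonal vectors which are
eigenvectors corresponding to the eigenvalue $\lambda = 6$. Now, by
normalizing each vector and putting

$$P = \begin{pmatrix} -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{3}} \\ \frac{2}{\sqrt{6}} & 0 & \frac{1}{\sqrt{3}} \end{pmatrix}$$

we have that $P^tAP = \mathrm{diag}(12, 6, 6)$.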
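The Gram-Schmidt step used for the repeated eigenvalue can be sketched in exact arithmetic as follows. This is a minimal illustration that orthogonalizes without normalizing (so that no square roots appear); the function name `gram_schmidt` is a hypothetical helper, not notation from Theorem 5.6.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(vectors):
    """Orthogonalize linearly independent vectors (no normalization)."""
    basis = []
    for v in vectors:
        w = [Fraction(x) for x in v]
        for b in basis:
            coeff = dot(w, b) / dot(b, b)   # component of w along b
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis


y1, y2 = gram_schmidt([(1, 1, 0), (0, 2, -1)])
print(y1, y2)                      # (1, 1, 0) and (-1, 1, -1)
print(dot(y1, y2))                 # 0
print(dot(y1, (-1, 1, 2)), dot(y2, (-1, 1, 2)))  # both 0
```

The last line reflects the fact that eigenvectors of a symmetric matrix belonging to different eigenvalues (here 6 and 12) are automatically orthogonal, so only vectors within the same eigenspace need this treatment.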


Exercises 6.5 

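1. Find an orthogonal matrix $U$ such that $U^tAU$ is a diagonal matrix
for the following matrices

$$\text{(i)} \begin{pmatrix} 1 & 0 & -4 \\ 0 & 5 & 4 \\ -4 & 4 & 3 \end{pmatrix} \qquad \text{(ii)} \begin{pmatrix} 11 & 0 & 6 \\ 0 & 5 & 6 \\ 6 & 6 & -2 \end{pmatrix}$$

$$\text{(iii)} \begin{pmatrix} 7 & -1 & -2 \\ -1 & 7 & 2 \\ -2 & 2 & 10 \end{pmatrix} \qquad \text{(iv)} \begin{pmatrix} 18 & -1 & -4 \\ -1 & 18 & -4 \\ -4 & -4 & 3 \end{pmatrix}$$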







6.6 Quadratic Forms 

Throughout this section we assume that $V$ is a finite dimensional
R-space.

DEFINITION 6.25 If $\mathscr{B} = \{v_1, v_2, \dots, v_n\}$ is an R-basis for $V$ then a
real quadratic form on $V$ is a function $Q : V \to R$ defined by

$$Q(v) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j \qquad (1)$$

where $v = \sum_{i=1}^{n} x_iv_i$ and $A = (a_{ij})$ is a real symmetric matrix.

For example, if $A = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$ then $Q : V \to R$ is given by

$$Q(v) = 2x_1^2 + 3x_2^2 + x_1x_2 + x_2x_1 = 2x_1^2 + 3x_2^2 + 2x_1x_2$$

The expression (1) will usually be written as

$$Q(v) = \sum_{\substack{i,j = 1 \\ i \leq j}}^{n} a_{ij}x_ix_j$$

and the coefficient $a_{ij}$ in this expression is shared equally between the
$(i, j)$- and $(j, i)$-positions of the matrix $A$.

If $X = (x_1, x_2, \dots, x_n)$ is the co-ordinate vector of $v$, then $Q(v)$ can
be expressed in matrix form as

$$Q(v) = XAX^t$$
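The passage from the "written" form of a quadratic form to its symmetric matrix, and the evaluation $Q(v) = XAX^t$, can be sketched as below. This is an illustrative sketch only: the function names `symmetric_matrix` and `quadratic_form`, and the 0-based dictionary keys, are assumptions made for the example.

```python
from fractions import Fraction


def symmetric_matrix(n, coeffs):
    """Build A from coefficients of x_i x_j given for i <= j (0-based keys).

    An off-diagonal coefficient is shared equally between the (i, j)-
    and (j, i)-positions, as described in the text.
    """
    a = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), c in coeffs.items():
        if i == j:
            a[i][i] = Fraction(c)
        else:
            a[i][j] = a[j][i] = Fraction(c) / 2
    return a


def quadratic_form(a, x):
    """Evaluate X A X^t for the co-ordinate vector x."""
    n = len(x)
    return sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Q(v) = 2 x1^2 + 3 x2^2 + 2 x1 x2
A = symmetric_matrix(2, {(0, 0): 2, (1, 1): 3, (0, 1): 2})
print(A)                          # [[2, 1], [1, 3]]
print(quadratic_form(A, (1, 2)))  # 2 + 12 + 4 = 18
```

Using `Fraction` keeps the halved coefficients exact when an off-diagonal coefficient is odd.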

We now consider the effect of a change of R-basis on this expression 
for Q. 

Let $\mathscr{B}' = \{v'_1, v'_2, \dots, v'_n\}$ be another R-basis for $V$; then (see p. 100)

$$v'_j = \sum_{i=1}^{n} p_{ij}v_i \quad (j = 1, \dots, n)$$

where $P = (p_{ij})$ is an invertible matrix. If $v = \sum_{j=1}^{n} y_jv'_j$, then

$$v = \sum_{j=1}^{n} y_j\left(\sum_{i=1}^{n} p_{ij}v_i\right) = \sum_{i=1}^{n}\left(\sum_{j=1}^{n} p_{ij}y_j\right)v_i$$

and hence

$$X^t = PY^t \quad \text{where } Y = (y_1, \dots, y_n)$$

Hence,

$$Q(v) = XAX^t = YP^tAPY^t = Y(P^tAP)Y^t$$

where $P^tAP$ is also a symmetric matrix since $(P^tAP)^t = P^tAP$.

If $D = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$ then $Q : V \to R$ defined by

$$Q(v) = \lambda_1x_1^2 + \dots + \lambda_nx_n^2$$

is a real quadratic form on $V$ which has a simple form called a diagonal
quadratic form. We consider whether it is possible to find an R-basis for
$V$ so that a given real quadratic form can be represented as a diagonal
quadratic form. We shall be even more restrictive with a view to later
geometrical applications; we shall insist that the matrix $P$ involved in
the transformation is orthogonal. We now of course have precisely the
same problem as was considered in the previous section and it follows
immediately from Theorem 6.24 that we have

THEOREM 6.26 Any real quadratic form may be reduced to a
diagonal quadratic form by means of an orthogonal transformation.

This result has a useful and elegant application in analytic geometry. 

The general equation of a conic section is of the form 

ax 2 + 2 bxy + cy 2 + ux +vy +p = 0 (1) 

We call 

ax 2 + 2 bxy + cy 2 

the quadratic form associated with (1). Equation (1) can therefore be 
expressed in the matrix form 

$$(x, y)\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + (u, v)\begin{pmatrix} x \\ y \end{pmatrix} + p = 0$$

Thus, if $X = (x, y)$, $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$, $T = (u, v)$, we have

$$XAX^t + TX^t + p = 0$$

The matrix $A$ is symmetric and by the above theorem, there exists an
orthogonal matrix $P$ such that

$$P^tAP = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$$

where $\lambda_1, \lambda_2$ are the eigenvalues of $A$. Thus, by carrying out the
transformation $X^t = PX'^t$ (or $X'^t = P^tX^t$), where $X' = (x', y')$, the form (1)
reduces to

$$X'P^tAPX'^t + TPX'^t + p = 0$$

or

$$\lambda_1x'^2 + \lambda_2y'^2 + u'x' + v'y' + p' = 0 \qquad (2)$$

where $p' = p$ and $(u', v') = TP$.

Thus, by introducing new axes, called the principal axes, $Ox'$ and $Oy'$
in the $xy$-plane in directions $X_1$ and $X_2$, where $P = (X_1, X_2)$, the conic
section has a graph represented by (2). Equation (2) may be reduced
further, depending on the values of $\lambda_1$ and $\lambda_2$. The following cases are
considered.

Case 1 $\lambda_1 \neq 0$, $\lambda_2 \neq 0$

If we put

$$x'' = x' + \frac{u'}{2\lambda_1}, \qquad y'' = y' + \frac{v'}{2\lambda_2}$$

in (2), we obtain

$$\lambda_1x''^2 + \lambda_2y''^2 = p''$$

where

$$p'' = -p' + \frac{u'^2}{4\lambda_1} + \frac{v'^2}{4\lambda_2}$$

In what follows we take $p'' > 0$. If $\lambda_1 = \lambda_2 > 0$, then this is the equation
of a circle with centre $\left(-\frac{u'}{2\lambda_1}, -\frac{v'}{2\lambda_1}\right)$ in the $x'y'$-plane and radius
$\sqrt{p''/\lambda_1}$. If $\lambda_1 > 0$, $\lambda_2 > 0$, it is the equation of an ellipse with centre
$\left(-\frac{u'}{2\lambda_1}, -\frac{v'}{2\lambda_2}\right)$ in the $x'y'$-plane which meets the new $Ox''$-axis at the
points $\left(\pm\sqrt{p''/\lambda_1}, 0\right)$ and the new $Oy''$-axis at the points
$\left(0, \pm\sqrt{p''/\lambda_2}\right)$.

If $\lambda_1 > 0$, $\lambda_2 < 0$, it is the equation of a hyperbola with centre
$\left(-\frac{u'}{2\lambda_1}, -\frac{v'}{2\lambda_2}\right)$ in the $x'y'$-plane; the vertices are $\left(\pm\sqrt{p''/\lambda_1}, 0\right)$
and the two lines $\sqrt{-\lambda_2}\,y'' = \pm\sqrt{\lambda_1}\,x''$ are the asymptotes.

Case 2 $\lambda_1 = 0$, $\lambda_2 \neq 0$

If we put

$$y'' = y' + \frac{v'}{2\lambda_2}$$

in (2), we obtain

$$\lambda_2y''^2 + u'x' + p'' = 0 \quad \text{where } p'' = p' - \frac{v'^2}{4\lambda_2}$$

Further, putting $x'' = x' + p''/u'$, if $u' \neq 0$, this becomes

$$\lambda_2y''^2 + u'x'' = 0$$

which is the equation of a parabola with vertex $\left(-\frac{p''}{u'}, -\frac{v'}{2\lambda_2}\right)$ in the
$x'y'$-plane and focus $(-u'/4\lambda_2, 0)$ in the $x''y''$-plane. The case $\lambda_1 \neq 0$,
$\lambda_2 = 0$ can be dealt with similarly. There are further "degenerate" cases
which we shall not consider further here.

The standard forms of these conics are given in Appendix 1.
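The case analysis above depends only on the signs of $\lambda_1$ and $\lambda_2$, which for a 2 x 2 symmetric matrix can be read off from $\lambda_1\lambda_2 = ac - b^2$. A minimal sketch follows; it ignores the degenerate and empty cases that the text also sets aside, and the function names `eigenvalues_2x2` and `conic_type` are hypothetical.

```python
import math


def eigenvalues_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], smaller first."""
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius


def conic_type(a, b, c):
    """Type of a x^2 + 2 b x y + c y^2 + ... = 0, non-degenerate cases only."""
    det = a * c - b * b           # product of the two eigenvalues
    if det > 0:                   # eigenvalues of the same sign (Case 1)
        return "circle" if (a == c and b == 0) else "ellipse"
    if det < 0:                   # eigenvalues of opposite sign (Case 1)
        return "hyperbola"
    return "parabola"             # one eigenvalue is zero (Case 2)


print(eigenvalues_2x2(5, -3, 5), conic_type(5, -3, 5))    # 5x^2 - 6xy + 5y^2
print(eigenvalues_2x2(7, -6, -2), conic_type(7, -6, -2))  # 7x^2 - 12xy - 2y^2
print(eigenvalues_2x2(1, -1, 1), conic_type(1, -1, 1))    # x^2 - 2xy + y^2
```

Note that $b$ here is half the coefficient of $xy$, in line with the general equation $ax^2 + 2bxy + cy^2 + ux + vy + p = 0$.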

This is illustrated in the following examples. The first examples are 
“central” conics. 


EXAMPLES 

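1. Draw a graph of the conic section

$$5x^2 - 6xy + 5y^2 = 8$$

In the matrix form this is

$$(x, y)\begin{pmatrix} 5 & -3 \\ -3 & 5 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 8$$

The eigenvalues are roots of

$$\begin{vmatrix} 5 - x & -3 \\ -3 & 5 - x \end{vmatrix} = (2 - x)\begin{vmatrix} 1 & -3 \\ 0 & 8 - x \end{vmatrix} = (2 - x)(8 - x)$$

i.e. the eigenvalues are $\lambda = 2, 8$ and the corresponding eigenvectors are
$\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$. Normalizing these vectors, we have

$$P = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

and then

$$P^tAP = \begin{pmatrix} 2 & 0 \\ 0 & 8 \end{pmatrix}$$

Now taking new axes $Ox'$, $Oy'$ in the directions $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$ and
$\left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$ respectively (or putting $x = \frac{1}{\sqrt{2}}(x' + y')$, $y = \frac{1}{\sqrt{2}}(x' - y')$),
the equation of the conic becomes

$$2x'^2 + 8y'^2 = 8$$

or

$$\frac{x'^2}{4} + y'^2 = 1$$

which is the equation of an ellipse.

2. Draw a graph of the conic section

$$7x^2 - 12xy - 2y^2 = 10$$

The corresponding matrix is

$$A = \begin{pmatrix} 7 & -6 \\ -6 & -2 \end{pmatrix}$$

which has eigenvalues $-5$ and $10$ and eigenvectors $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} -2 \\ 1 \end{pmatrix}$
respectively. Let

$$P = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix}$$

then

$$P^tAP = \begin{pmatrix} -5 & 0 \\ 0 & 10 \end{pmatrix}$$

Taking new axes $Ox'$, $Oy'$ in the directions $\left(\frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}}\right)$ and $\left(-\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}}\right)$
respectively, the equation of the conic becomes

$$-5x'^2 + 10y'^2 = 10$$

or

$$-\frac{x'^2}{2} + y'^2 = 1$$

(This may also be obtained by putting $x = \frac{1}{\sqrt{5}}(x' - 2y')$,
$y = \frac{1}{\sqrt{5}}(2x' + y')$.) Thus we have a hyperbola as illustrated in Fig. 3.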



Figure 3 


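3. Identify and sketch the graph of the conic section whose equation is

$$x^2 + y^2 - 2xy - 4\sqrt{2}\,x + 6 = 0$$

The matrix form is

$$(x, y)\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + (-4\sqrt{2}, 0)\begin{pmatrix} x \\ y \end{pmatrix} + 6 = 0$$

The characteristic polynomial of this matrix is $x(x - 2)$ and the
eigenvalues are $0$ and $2$. The corresponding eigenvectors are
$\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ respectively. If we now let

$$P = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

then, taking new axes $Ox'$, $Oy'$ in the directions $(1/\sqrt{2}, 1/\sqrt{2})$ and
$(1/\sqrt{2}, -1/\sqrt{2})$ respectively, the equation of the conic section becomes

$$2y'^2 + \frac{1}{\sqrt{2}}(-4\sqrt{2}, 0)\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x' \\ y' \end{pmatrix} + 6 = 0$$

or

$$y'^2 - 2y' - 2x' + 3 = 0$$

from which we obtain

$$(y' - 1)^2 = 2(x' - 1)$$

Thus, the conic section is a parabola with vertex $(1, 1)$ in the $(x', y')$-plane,
which has coordinates $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, i.e. $(\sqrt{2}, 0)$, in
the $(x, y)$-plane as illustrated in Figure 4.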
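A quick numerical sanity check of Example 3 is to take points on the reduced parabola $(y' - 1)^2 = 2(x' - 1)$, map them back to the $(x, y)$-plane with $x = (x' + y')/\sqrt{2}$, $y = (x' - y')/\sqrt{2}$, and confirm that they satisfy the original equation. The sketch below does this in floating point; the names `conic`, `to_xy` and `residuals`, and the sample parameter values, are chosen only for illustration.

```python
import math


def conic(x, y):
    """Left-hand side of x^2 + y^2 - 2xy - 4*sqrt(2)*x + 6 = 0."""
    return x * x + y * y - 2 * x * y - 4 * math.sqrt(2) * x + 6


def to_xy(xp, yp):
    """Map principal-axis coordinates (x', y') back to (x, y)."""
    s = 1 / math.sqrt(2)
    return s * (xp + yp), s * (xp - yp)


residuals = []
for t in (-2.0, -0.5, 0.0, 1.0, 3.0):
    xp, yp = 1 + t * t / 2, 1 + t        # a point with (y'-1)^2 = 2(x'-1)
    residuals.append(conic(*to_xy(xp, yp)))

print(residuals)       # all zero up to rounding error
print(to_xy(1, 1))     # the vertex, approximately (sqrt(2), 0)
```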
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The same method can be used for quadric surfaces in 3-space. The 
general equation of a quadric surface is of the form 


$$ax^2 + by^2 + cz^2 + 2dxy + 2exz + 2fyz + ux + vy + wz + p = 0$$

which may be again represented in the matrix form

$$XAX^t + TX^t + p = 0$$

where now

$$A = \begin{pmatrix} a & d & e \\ d & b & f \\ e & f & c \end{pmatrix}, \quad X = (x, y, z), \quad T = (u, v, w)$$


This may be reduced in precisely the same way as the conic section — 
details are not given in this case. In Appendix 2, the graphs of the 
non-degenerate surfaces are given. The methods are illustrated in the 
following examples. 


EXAMPLES 

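1. Identify the quadric surface whose equation is

$$7x^2 + 7y^2 + 10z^2 - 2xy - 4xz + 4yz = 24$$

The corresponding matrix is

$$\begin{pmatrix} 7 & -1 & -2 \\ -1 & 7 & 2 \\ -2 & 2 & 10 \end{pmatrix}$$

which has been considered in Example 2 on p. 168.

The eigenvalues are $\lambda = 6, 12$. Corresponding to $\lambda = 12$ the eigenvector
is $\left(-\frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{2}{\sqrt{6}}\right)$ and corresponding to $\lambda = 6$ the orthogonal
eigenvectors are $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right)$ and $\left(\frac{1}{\sqrt{3}}, -\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right)$. Taking new axes
$Ox'$, $Oy'$ and $Oz'$ in the direction of these unit vectors respectively (or
equivalently by substituting

$$x = \frac{1}{\sqrt{6}}(-x' + \sqrt{3}\,y' + \sqrt{2}\,z')$$

$$y = \frac{1}{\sqrt{6}}(x' + \sqrt{3}\,y' - \sqrt{2}\,z')$$

$$z = \frac{1}{\sqrt{6}}(2x' + \sqrt{2}\,z'))$$

the quadric surface takes the form

$$12x'^2 + 6y'^2 + 6z'^2 = 24$$

or

$$\frac{x'^2}{2} + \frac{y'^2}{4} + \frac{z'^2}{4} = 1$$

which is the equation of an ellipsoid.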

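2. Identify the quadric surface whose equation is

$$2x^2 + y^2 - 3z^2 + 12xy - 4xz - 8yz + 8x + 12y + 22z + 44 = 0$$

The corresponding matrix form is

$$(x, y, z)\begin{pmatrix} 2 & 6 & -2 \\ 6 & 1 & -4 \\ -2 & -4 & -3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} + (8, 12, 22)\begin{pmatrix} x \\ y \\ z \end{pmatrix} + 44 = 0$$

It may be verified that the eigenvalues of this matrix are $-3$, $-6$ and $9$
and the corresponding eigenvectors are

$$\begin{pmatrix} 2 \\ -1 \\ 2 \end{pmatrix}, \quad \begin{pmatrix} -1 \\ 2 \\ 2 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix}$$

respectively. If we put

$$P = \frac{1}{3}\begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & 2 \\ 2 & 2 & -1 \end{pmatrix}$$

and take new axes $Ox'$, $Oy'$, $Oz'$ in the direction of the unit vectors
$\frac{1}{3}(2, -1, 2)$, $\frac{1}{3}(-1, 2, 2)$, $\frac{1}{3}(2, 2, -1)$ respectively, then the equation of
the quadric surface becomes

$$3x'^2 + 6y'^2 - 9z'^2 - 16x' - 20y' - 6z' - 44 = 0$$

since

$$\frac{1}{3}(8, 12, 22)\begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & 2 \\ 2 & 2 & -1 \end{pmatrix} = (16, 20, 6)$$

Now, put $x'' = x' - 8/3$, $y'' = y' - 5/3$, $z'' = z' + 1/3$; then the equation
of the quadric surface becomes

$$x''^2 + 2y''^2 - 3z''^2 = 27$$

or

$$\frac{x''^2}{(3\sqrt{3})^2} + \frac{y''^2}{(3\sqrt{3}/\sqrt{2})^2} - \frac{z''^2}{3^2} = 1$$

Thus, we have a hyperboloid of one sheet (see Appendix 2) with centre

$$\frac{1}{9}(8, 5, -1)\begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & 2 \\ 2 & 2 & -1 \end{pmatrix} = (1, 0, 3)$$

with axes in the direction of the unit vectors given above.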
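The arithmetic in Example 2 (the new linear terms $TP$, the completion of the squares and the position of the centre) can be reproduced exactly with rational numbers. The sketch below is an illustration under the data of the example; the variable names (`Q` for $3P$, `TP`, `centre_new`, `centre_old`, `rhs`) are assumptions introduced here.

```python
from fractions import Fraction

Q = [[2, -1, 2], [-1, 2, 2], [2, 2, -1]]   # P = Q / 3, columns are eigenvectors
T = [8, 12, 22]                            # linear coefficients (u, v, w)
lams = [-3, -6, 9]                         # eigenvalues, in the order of the columns
p = 44

# New linear coefficients (u', v', w') = T P
TP = [Fraction(sum(T[i] * Q[i][j] for i in range(3)), 3) for j in range(3)]

# Completing the square in each variable: centre in the (x', y', z') system
centre_new = [-t / (2 * lam) for t, lam in zip(TP, lams)]

# The same point in the original (x, y, z) system, using X^t = P X'^t
centre_old = [sum(Fraction(Q[i][j], 3) * centre_new[j] for j in range(3))
              for i in range(3)]

# Right-hand side of  sum(lam_i * x_i''^2) = rhs  after completing the squares
rhs = -p + sum(t * t / (4 * lam) for t, lam in zip(TP, lams))

print(TP)          # (16, 20, 6)
print(centre_new)  # (8/3, 5/3, -1/3)
print(centre_old)  # (1, 0, 3)
print(rhs)         # -81, i.e. -3x''^2 - 6y''^2 + 9z''^2 = -81
```

Dividing $-3x''^2 - 6y''^2 + 9z''^2 = -81$ by $-3$ gives $x''^2 + 2y''^2 - 3z''^2 = 27$, in agreement with the example.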

Exercise 6.6 

1. Reduce the following real quadratic forms to diagonal form

(i) $x^2 + y^2 + xy$

(ii) $x^2 + y^2 - xy$

(iii) $3x^2 + 3y^2 + 5z^2 - 2xy$

(iv) $2xy + 2xz + 2yz$

(v) $5x^2 + 11y^2 - 2z^2 + 12xz + 12yz$

2. Find the principal axes, centre and sketch the graph of the following conics

(i) $xy = 2$

(ii) $3x^2 - 2y^2 + 12xy = 42$

(iii) $7x^2 + 4y^2 - 4xy = 24$

(iv) $13x^2 + 13y^2 + 10xy = 72$

(v) $5x^2 + 26xy + 5y^2 - 70x - 38y = 7$

(vi) $73x^2 + 72xy + 52y^2 - 190x - 80y = -25$

3. Find the principal axes, centre and identify the quadric surface whose
equation is

(i) $2xy + 2xz + 2yz = 4$

(ii) $x^2 + 2y^2 + z^2 - 2xy - 2yz = 1$

(iii) $4x^2 + 3y^2 + 3z^2 - 4xy + 4xz - 6yz = 16$

(iv) $5x^2 + 2y^2 + 2z^2 + 4xy + 4xz + 2yz = 7$

(v) $x^2 + y^2 + z^2 - 4xy - 4xz - 4yz = 3$

(vi) $11x^2 + 18y^2 + 4z^2 - 12xy + 12xz = 22$

(vii) $7x^2 - 2y^2 + 4z^2 + 16yz - 20xz - 4xy + 5x + 5y - 2z = 0$

(viii) $7x^2 - 2y^2 - 2z^2 - 8yz + 10xz - 10xy + 2x + 2z = 0$
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APPENDIX 1 


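Conic Sections

[Figures: sketches of the standard conics in the $xy$-plane.]

Ellipse: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$

Hyperbola: $\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} = 1$

Parabola: $y^2 = 4ax$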
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APPENDIX 2 


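Quadric Surfaces

[Figures: sketches of the non-degenerate quadric surfaces.]

Ellipsoid: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$

Hyperboloid of one sheet: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = 1$

Hyperbolic paraboloid: $\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} = z$

Elliptic paraboloid: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = z$

Hyperboloid of two sheets: $\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = 1$

Elliptic cone: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = \dfrac{z^2}{c^2}$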
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Solutions to Exercises 


Exercises 1.2 


1.(0 /I 

0 

0 

5/8 \ 

00 /l 

0 

°\ 

0 

1 

0 

-1/8 

0 

1 

0 

\o 

0 

1 

1/8 / 

0 

0 

1 




/ 

\o 

0 

0 / 


Oii) 



(iv) /l 


0 

0 

0 

0 


0 

1 

0 

0 

0 


1 

1 

0 


0 

0 

1 


0 0 
0 0 


2 0 
0 1 
2 1 
0 0 
0 0 


(v) 


2. (i) Yes 



0 

1 

0 

0 



Exercise 1.3 

1. General solutions are: 

(i) (-\-2n~3v, 3X—4/i —14 5X, 5/z, 5v) 

(ii) (-9X-7/i, 19X+35ju, 7X-49/X, 14X, 14/x) 

(iii) (X,—X,4X,3X) 

(iv) (3X— 6 / 1 , 7X—9jU+2^,3X,3M,3^,3^) 
where X, ju, ^ are arbitrary. 
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2. General solutions are: 


(i) (4—2X—2/z,9—4X—3/i,X,ju) 

(ii) No solution 

(iii) (-1 -X+m, 7/2+1 /2X—5/2/x,—1 /2-1 /2X+1 /2 ju,X;aO 

(iv) No solution 
where X,ju are arbitrary. 

3. (a) (i) X 9^ 8; (ii) X = 8, p 9^ 8; (iii) X = 8, /i = 8 

(b) (i) X 9^ 11; (ii) X = 11, fi =£ 3; (iii) X = 11, n = 3 

4. X 9^2, X 9^ 1 

when X = 1,2a — 3b +c = 0, (2a-b-fi, b-a, /jl ) 
when X = 2, 2a-3b+c = 0, (b-a-fjL,p — b+2a), ji arbitrary 

5. (i) X = -14,(—l+/z,3—2ju, m);X9^-14,(0, 1, 1) 

(ii) X =£ 1/2; X = — 1, (—2+/i, 1, *0; X # 1/2, X * -1, , 

(—14X 2 +5X+12,8X 2 -4X-6,3-2X) 1 ; 

(iii) X = l,(l/3, -2/3) 

(iv) X 2 +2X = a , (—+2+/i, 3(7— 1 — /jl, i ±, (7— 1), /jl arbitrary 

6. (—X—2/.7, 1—X, X, 1—2^, jLi), X, /z arbitrary 

7. 72 = 8, X(1 ,—1,0,1,—1,0,1,—1), X arbitrary 
72 = 9, (0,0,0,0,0,0,0,0,0) 


Exercises 1.4 


3. a = 2, b = 1/4, c = —1/64 or a = —2, Z? = —1/4, c = 1/64 
5. If^=(a ;7 )thena.„. / . + 1 = a„. i+1J O', 7=1, • ■., n) 
Exercises 1.5 


4. ^4 (a)" 1 = ^4(-o:);x 3 — 3x 2 +3x—1 = 0 


6 . ± 


V2 


1 1 

1 -1 


+ 


1 / 1 


V2 -1 
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Exercises 1.6 


i. 0) 


% 


aii) 


(v) 


(vii) 




2 / -2 
— 2 — 2 / - 2 / 
— 1—7 1—7 


1+7 

-1+7 

0 


Exercises 1.7 

i. o) 


2 

i) 


(ii) 

| 

In 

-9 

i\ 

0 

i 



i 

-7 

9 

-2 

-2 

-0 



1 

2 

-3 

1/ 

-2 

19 \ 


(iv) 


2 

3 

l\ 

5 

-10 



1 

2 5 

17 

13 

-29 

13 

-w 




9 

1 

-8/ 

2 

10 

-3\ 

(vi) 


None 


0 

5 

5 






-1 

0 

4 






2 

5 

-3 / 







(ii) 


P= 1 

L 3 


p = L 

7 


3 

-2 

3 

-1 

2 


Q = 


h 
o 


0 

1 


0 0 


2 

4 

-1 


2 = 1 


i\ 

-l 

l l 
0 0 
7 0 
0 7 
0 0 



(iii) 


5 3 

-1 

0 


D — 1 

4 0 

4 

0 


P 'l2 

-7 3 

-1 

0 



\ i o 

0 

1 

(iv) 


(0 -4 

2 

-4 


P= — 

0 2 

6 

2 


1 4 

0 3 

2 

-4 



U 1 

0 

1 


0=L 


/ 


Q = 


_ J_ 
1 4 


14 0 0 -4 -6 

0 14 0 2 -4 

0 0 14 -11 -1 

0 0 0 14 0 

\ 0 0 0 0 14 


(P and Q are not uniquely determined.) 
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2. {(i), (iii)} , {(ii)> , {(iv), (v)} 


Exercises 2.1 

1. (i) 7, (ii) 12, (iii) -18, (iv) -6, (v) 0 

2. (i) ( a-b) (a-c) (b-c ) (a+Z?+c) 

(ii) (a-&) (a-c) (Z?-c) (a+Z?+c) (a 2 +Z? 2 +c 2 ) 

(iii) — 2(a— b) (a-c) (b-c) ( a+b+c ) (a 2 +b 2 +c 2 ) 

(iv) 8a (x—y) (x—z) (y—z) (x 2 +y 2 +z 2 +a 2 ) 

(v) (a-b) (a-c) (b—c) (a +b+c) (a 2 +b 2 +c 2 ) 

3. 6 = nn— 7t/ 12 or mr + l7rl\2 (n = 0,1,2,...) 

4. 0, 6—c, (a+Z?+c)/2 

Exercises 2.2 

1. (0 5, (ii) 11, (iii) 2 

2. a,b,c, - (a+b+c) 

3. (i) 0, (ii) ±4 

5 . ( — 1 )” n Gt -\\0L2 n OL?, n — 1 * * * ^nn 

6 . 2 3k 

Exercises 2.3 

1. (i) is non-singular 

2. (a—(3) 2 fa-7) 2 f/3-7) 2 (a+0+7) 

3. K-c)'^ 3 -27c) 

5. /a b c\ 

C = i c a b \ 

\ b c a J 
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6. (i) Yes (ii) No 


7. (-5/3,-12/3,-16/3); p=2, (10/3 +p, 2/3 +p, p); p = -3, (5+p, 4+p, p) 
jLi arbitrary 

8. (i) k - — 1, k = —4, (ii) k 4 ±1, k 4 -4, (iii) k = 1 


Exercises 2.4 
l.(i) 


00 


(iii) 


3. 


i 

—3 

5 

9 

2- 
72 | 

18 

-6 

18 

6 

14 

-18 


3 

-1 

-2 

-0 

3 

-4 

1 

1 

l- 9 

-3 

3 


( u 

-9 

1 

1 

-7 

9 

-2 


1 2 

-3 

1 

(1,0,0) 

(ii) 1 

178 

±1, 


1 

/ 1 
-t 

-t 

(1- 

- 1 2 ) 


(41,17,169) 


5. 


x 2 -x 

-2 


-1 


x 2 -x-l 
—(x+1) —(x+1) 


-z 1 

1 


-f \ 

1 
1 


x+1 \ 

2(x+l) 

(x+1) 2 


, x = ±l, 


1 ±V 3i 


Exercises 3.3 

1. (i) and (v) 

2. (ii), (iii) and (iv) 

3. (i), (iii) and (iv) 

4. (i), (iii) and (vi) 

5. (ii) and (iii) 
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Exercises 3.4 


2 . ( 1 , 2 , 1 ), ( 0 , 1 ,— 2 ) 

5. e.g. {(1,0,1,1), (1,0,2,4), (1,0,0,0), (0,1,0,0)} 

7. e.g. {(1,1,0,—1), (4,—2,1,0), (1,0,0,0), (0,1,0,0)} 



9. e.g. {(1,0,-1,0), (0,1 ,-1,0), (1,0,0,0), (0,0,1 ,-l)}, {(0,1,0,0)} 

10. e.g. {(2,1,0,0), (-1,0,1,0), (3,0,0,1)} 

12. —4 Ui + 

13. (i) Yes (ii) No (iii) Yes 

14. k-No, v-Yes, e.g. {(2,-1,3,2), (-1,1,1,3), (3,-1,0,-1), (1,0,0,0)} 

{(1,0,4-1), (2,-1,3,2), (1,0,0,0), (0,1,0,0)} 

15. e.g. {(1,0,—1,1), (2,—1,0,1), (1,3,3,2)} , SU{y} 

16. 4,8; 2,4 

17. {(0,1,1)} ; {(1,1,0), (/, 1+1,1)} 

18. {1 -t 2 +f 3 ,t+t 2 -t 3 } , {l-t 2 +t 3 ,2+t-t 2 +t 3 , l+t-t 2 +t 3 ,t 2 } 


Exercises 4.1 

1. (i) Yes (ii) No 

2. (i) Yes (ii) Yes 

3. (i) Yes (ii) Yes 


(iii) No (iv) No 
(iii) Yes (iv) No 
(iii) No (iv) Yes 


(v) Yes 
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Exercises 4.2 


1. (i) / 1 -1 0 \ 

12-1 

\ 2 1 1 

2. I 3 0 2 \ 

0-1 0 

1-2 0 -2 




-11 12 \ 
1 0 
4 -6 


4. 


If 5 = 


/ Sl 1 
\S21 


5 12 

S 22 


matrices relative to { e n , e 12 , e 2 i, e 22 } are 


0) 

Sll 

0 

S 12 

0 


0 

S 1 1 

0 

S 12 


5 21 

0 

S 22 

0 



S 21 

0 

s 22 

(iii) 

1° 

~ s 21 


s 12 


~ S 12 

( s ll~ 

~ 5 22 ) 

0 


$21 0 ($22 ^11 ) 

0 S 2 i —5 12 





6. 

( 3 

X+2 

1 \ 


1 +X+/i 




^ l+XHju 2 

X 2 +n 2 +X/i 2 

M 2 / 

7. 

{x 2 ,y 2 ,z 2 

, xy, xz, yz } 



/ s xl 

5 21 

0 

0 

S 12 

S 22 

0 

0 

0 

0 

S 1 1 

S 21 

\° 

0 

5 12 

s 22 1 


0 \ 


s 12 
~ s 21 



(i) \4= 1, 1, \=£ jj. 

(ii) X = ji = 2 


a 2 

0 

0 

0 

0 

0 \ 

1 

P 2 

0 

p 

0 

0 \ 

1 

y 2 

0 

y 

0 

0 

2a: 

0 

0 

aP 

0 

0 

2a 

0 

0 

ay 

0 

0 / 

1 

207 0 

(0+7) 0 

0/ 
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8. {\,x,y,x 2 ,xy,y 2 } , {l,x,x\x 3 } 

/ 1 0 2 0 0 4 \ 

0 1 0 0 2 0 \ 

0 0 0 0 ■ 0 0 
0 0 0 1 0 0 

0 0 0 0 0 0 / 

\ 0 0 0 0 .0 0 / 


Exercises 4.3 

1. e.g. {(—1,2,3,0), (1,0,0,—1)} , {(1,1,0) 5 ( 1 5 2,3)} 


2. 2,2 
3 , 


If P = 


1 1 
0 1 
0 0 
0 0 


1 1 
1 1 
1 1 
0 1 


1 

1 

1 

1 


\ 


0 


1 

1 


/ 



1-1 1-1 
0 1-23 

0 0 1-3 

0 0 0 1 


(-If 

(_!)«+,(«) 

(-!)"(") 

(-l n+1 (")) 


€ -^n+i (R) 



0 0 


0 0 
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then 


(a) 0) 


A = 


/° 

n 


1 

0 0 


0 

2 


0 0 0 


0 0 0 
0 0 0 


0 0 
0 0 


0 

0 0 


a / 


00 Q' l AQ 


(b) (i) 


5 = 


a 

o 

o 


l 

l 

o 


(iii) P AP 

1 . . 

2 . . 

1 . . 


0 0 


*0 0 0 


(?) 

(?) 

(?) 


1 / 


(ii) Q-'BQ 
4. 2,2; x=y = 0 


(iii) P' l BP 


5. ker S = {a\a G Rj , im S = 2 a- ; . x i y^\oc ij G R 

v j, /=o 

1< /+/<« 

ker T- {a+px+yy+8xy |a,0,7,6 G R} 


im T = 2 x l yi la f/ . G R, a x i =0 

U /=0 

2<z +/<« 


{l,xj>,;t 2 ,xyj' 2 ,. .. • • • , xy n ~ 1 ,y n } 
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diag (A 0 Ai., A ) where A , = diag (i, (i = 0,1,..., n) 

v - - V 

0 + 1 ) copies 

diag (B 0 ,B 1 ,...,B n ) where= diag (z 2 -i(2k+l)+2k 2 )z-0,1 

k= 0 , 1 ,..., i 

6 . (i) R (ii) {/(x)|/( w ) (x) = 0} /?>1 (iii) <e*> 

7. (i) !0|,:W,(R), Oil (0},« a (R), (iii) j(" a.bE R , 

{U’«) h eR ) 

8 . p—q = n—m 

9. 1 

11. V=V 2 (R\T(a,b) = (0,a),a,bER. No 

Exercises 4.4 

1. X ^ 1/8 

2. 7 , (a 1 ,a 2 ,a 3 )= 1/3 (ai +a 2 +a: 3 ,3a 2 +3a 3 , -oq +5a 2 + 20 : 3 ) 

4. ker D 4= 0 

Exercises 4.5 

2. (i) a 4=1, —14 rank 4; a = 1 rank 2\ a - —14 rank 3 

(ii) a 4= 0,1,21/20 rank 4; a - 0 rank 2; a = 1 rank 3; a = 21/20 rank 3 

3. 2,2 

4. t = 0, — 1; t = 0 rank 2; f = —1 rank 2 

5. rank 3 unless p=y=0 when rank 1 or a=0, f3=y4 : 0 when rank 2 
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Exercises 5.1 

1. (ii) and (iii) 

2. (0,2,—1) 

3. (i) y/6 (ii) \/38 (iii) \[2 

4. (i) cos" 1 6/y/ 42, (ii) n/2 
Exercises 5.2 

1. (h) 

2. (i) 

3. (i), (ii) and (iii) 

4. None 

5. (i) 3,3,Vl4, -2, cos" 1 -2/9 

(ii) >/l7,V^,\/r05,25,cos" 1 25/V7^ 

(iii) l/y/3, 1 /x/2, \/(57r 2 —4)/67 t 2 , -2/r 2 , cos" 1 

7 t 2 

6. Sum of squares of diagonals equals sum of squares of sides 
Exercises 5.3 

2. (i) <(1,1,0,0), (2,0,1,0), (—1,0,0,1 )> 

(ii) <(l+z, 1— i)> 

3 . 1 — 6x+6x 2 , 1 /y/2, 6ac + 3 ad + 3 be + 2bd = 0 

4. \[2 sin i nx, (i = 1,2,.. ., n) 

5. (i) {(1,-1,1), (4,5,1), (2,-1,-3)} 

(ii) {(1,-1,1,1),(0,1,0,1),(3,1,-1,-1)} 

(iii) { (1 ,-l, 0 , (1+4/, 2-i, 5+0 } 
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6. (i) {(1 A/2,0,1 A/2), (0,1,0), (1 /V2,0,-l/y/2)} 

(ii) {1 /2(1, i,l, /), 1 /2(i,l ,i,l), 1 A/2(l ,0-1,0), 1 A/2(0,l ,0,-1)} 

7. {(1,0,0),(1-1,0),(0-1,1)} 

8. {1,2x—1,6.x 2 —6x+l, 20x 3 —30x 2 +12x—1} 

9. (i) {e ly e 2 ,e 3 } ;(1 ,*,—*) 

(ii) (l,l/V3(2x-l), l/V5(6x 3 -6x+l)} ; (4/3,^3/2^6, (5/6,0,V5/6) 
10. {l,x,x 2 —1/5,x 3 — 3/7x} 

12. They are perpendicular 

Exercises 6.2 

1. (i) (a) None (b) None (c) 1 ±\/5V 

(ii) (a) -1 (b) -1 (c) -1 ,±i 

(iii) (a) 1 (b) 1,±V^ (c) 1, ±V2, ± i 

2. (i) (1—x) (2—x) (3—x); 1, < (1,—1,0) >; 2,< (2,—1,—2) >; 

3,< (1,—1,—2) > 

(ii) (1-x) 3 ; 1,< (1,1,1) > 

(iii) —x(x 2 —2x—7); 0, (1,2+/,—1), —1 

(iv) -(1-x) 2 (l+x)(3-x); 1,< (1,0,0,-1), (0,1,-1,0) >; 

—1, < (1,—1,—1,1) >, 3, < (1,1,1,1) > 

(v) (x—1 ) 2 (x-2)(x-3);l,<(l,-1,0,0), (1,0,-1,0)> 

2,<(—2,4,1,2)>,3,<(0,3,1,2)> 

(vi) (x—1) (x 2 —2) (x 2 +l); 1, <(1,0,1 ,-1,0 )>;v^,<(-1,\^,1,0,0)>; 

-y/2, <(1 ,V2,-1,0,0)>; i, <(1,0,0-1 +/, 1 -ip; -i, <(1,0,0-1 -i, 
l +/)> 

4. For /= 1,2,.. . , n, if X { is an eigenvector corresponding to \ then X t is 
an eigenvector corresponding to (i) 1 and (ii) X k (k = 1,2,. ..) 

5. 0, < (1,0,. . . , 0)> 
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7. Reflection: 1, < (cos 0, sin 0) >; -1, < (sin 0, -cos 0) > 
Rotation: None 

8. (i) (—I/ 2-1 (<*! + oc 2 x+ .. . +a n x n ~ l —x n ) 

(ii) ( b-x) n ~ l (1 . .. +a n ~ l +Z?-jc) 

Exercises 6.3 

1. (i) / 0 1+/ 1 —i 

1 1 1 | ,P' l AP = diag (—1,/,—/) 

\-l 1 1 

(ii) / 1 2 1 

P =(-1 -1 -1 ) ,P' l AP = diag (1,2,3) 

0 -2 -2 


(iii) / 4 
P = 


4 

1 0 
\ 0 - 1 


1 I ,P' 1 AP = diag (0,1,2) 
0 


(iv) II 0 1 

P= I 0 1 0] ,P ml AP = diag (2,l+z,2—2z) 

1 0 -1 

(v) Not diagonalizable 


(vi) 


P = 



0 

0 

1 

1 


2 \ 

2 

2 

-1/ 


, P^-1 AP = diag(0,1,2,-1)


(vii) Not diagonalizable 


2. A: 1, <(3,5)>; 3, <(1,1)>; P =


B: 2, <(1,-1)>


3. 


P = 


'a 2 

a 2 


0 

a 

-1 

0 


a‘ 

-a 

~a 

1 


i\ 


, P^-1 AP = diag(i a, a, -a, 1+a+a^2)
4. 1/2 / 2   2^(n+1)-2   3^(n+1)-2^(n+1)-1 \
       | 0   2^(n+1)     2.3^n-2^(n+1)     |
       \ 0   0           2.3^n             /


5. 


P = 


2 

1 

0 


1 

0 

-1 



      / 1   0   0 \
X = P | 0  -1   0 | P^-1
      \ 0   0   0 /




Exercises 6.4 

1. (i) x(x-2); (ii) (x-1)^2(x-2); (iii) (x-1)(x-2); (iv) (x-1)^2


2. χ_A(x) = m_A(x) = x^n - a_(n-1)x^(n-1) - a_(n-2)x^(n-2) - ...

3. (x-1)(x+2)(x-3), (x-1)(x+2)^2(x-3), (x-1)^2(x+2)(x-3),
(x-1)^2(x+2)^2(x-3), (x-1)^3(x+2)(x-3), (x-1)^3(x+2)^2(x-3)

5. (i), (iii) diagonalizable


Exercises 6.5 


1. (i)

1/3 



2 

-1 

2 


(iii) 






2 

3 

-6 



1 A /6 \ (iv) 

—i A/6 1/VTs 

-2/V6 / 


1 1 3 2\/2 \ 

1 -3 2>/2 

\4 0 -V 2 / 


Exercises 6.6 

1. (i) 3/2x^2 + 1/2y^2; (ii) 3/2x^2 + 1/2y^2; (iii) 2x^2 + 4y^2 + 5z^2;

(iv) 2x^2 - y^2 - z^2; (v) 7x^2 - 7y^2 + 14z^2
2. (i) 1/√2 (1,1), 1/√2 (1,-1); x^2 + y^2 = 4;

(ii) 1/√13 (3,2), 1/√13 (2,-3); x^2/6 - y^2\l = 1;

(iii) 1/√5 (1,2), 1/√5 (2,-1); x^2/8 + y^2/3 = 1;

(iv) 1/√2 (1,1), 1/√2 (1,-1); x^2/4 + y^2/9 = 1;

(v) 1/√2 (1,1), 1/√2 (1,-1); (1/2, -5/2); x^2/4 - y^2/9 = 1;

(vi) 1/5 (3,4), 1/5 (-4,3); (7/5, -1/5); x^2/4 + y^2 = 1

3. (i) 1/√3 (1,1,1), 1/√2 (1,-1,0), 1/√6 (1,1,-2); 2x^2 - y^2 - z^2 = 1

(ii) 1/√3 (1,1,1), 1/√2 (1,0,-1), 1/√6 (1,-2,1); y^2 + 3z^2 = 1

(iii) 1/√6 (2,1,1), 1/√3 (1,-1,1), 1/√2 (0,1,1); x^2/8 + y^2/2 = 1

(iv) 1/√2 (0,1,-1), 1/√3 (1,-1,-1), 1/√6 (2,1,1); x^2/7 + y^2/7 + z^2 = 1

(v) 1/√2 (1,-1,0), 1/√6 (1,1,-2), 1/√3 (1,1,1); x^2 + y^2 - z^2 = 1

(vi) 1/11 (7,6,6), 1/11 (6,-9,2), 1/11 (6,2,-9); 2x^2 + y^2 = 1

(vii) 1/3 (1,-2,2), 1/3 (-2,1,2), 1/3 (2,2,1); 2z = 6y^2 - 3x^2

(viii) 1/√3 (1,1,-1), 1/√2 (0,1,1), 1/√6 (2,-1,1); (-1/12, 1/8, 1/24);
x^2 + 2y^2 - 3z^2 = -1/72
Index 


Addition: 

of linear transformation, 115 
of matrices, 16 
of vectors, 66, 67 
Adjoint matrix, 59 
Angle between vectors, 135 
Associative law of: 
addition, 20 
multiplication, 20 
Augmented matrix, 9 
Axes, principal, 174 

Basis (K-), 76-88 
change of, 99-104 
coordinates relative to, 82 
orthogonal, 137 
orthonormal, 137 
standard, 79 

C[a, b], 72 
Canonical form, 36 
Cauchy-Schwarz inequality, 134 
Cayley-Hamilton Theorem, 163 
Change of K-basis, 99-104 
Characteristic polynomial: 

of a linear transformation, 153 
of a matrix, 150 
Cofactor, 44 
Column: 

equivalent matrices, 33 
of a matrix, 3 
rank, 118,146 
Commutative law: 
of addition, 20 
of multiplication, 19 
Complement (orthogonal), 145 
Conic section, 174—179 
Consistent solution, 9 
Coordinates, 82 
Cramer-solution, 61 

Determinant: 

2 x 2, 3 x 3, 38—42 
computation of, 48-50 
definition of, 43 


existence of, 45—46 
function, 43 
of a matrix, 43 
of a product, 55 
of elementary matrices, 53 
of linear independent vectors, 77 
of the transpose, 55 
row and column expansion, 48 
uniqueness of, 46—47 
Vandermonde, 50 
Diagonal matrix, 22 
Diagonalizable: 

linear transformation, 156 
matrix, 156 
Diagonalization: 

of a matrix, 156-160 
of a real symmetric matrix, 
164-169 
Dimension, 83 
Direct sum, 144 
Directed line segment, 63 
Direction cosines, 128 
Distance between vectors, 128, 134 
Distributive law, 20 
Dot product, 129 

Echelon matrix, 4 
reduced, 4 
Eigenvalue, 149 

of a real symmetric matrix, 
167-168 
Eigenvector, 149 
Eigenspace, 153 
Elementary: 

column operations, 33—35 
matrix, 26-33 
row operations, 3-8 
Equality of matrices, 16 
Equations, linear (see System of linear 
equations) 

Equivalence relations, 36, 147, 158 
Equivalent matrices, 36 
column, 33 
row, 4, 121—122 
systems, 9 
Euclidean space, 131 
Field: 

of complex numbers C, 63 
of rational numbers Q, 63 
of real numbers R, 63 
Finite dimensional vector space, 83 
Finitely generated space, 75 
Form: 

canonical, 36 
diagonal quadratic, 173 
Jordan canonical, 159 
normal, 36 
real quadratic, 172 

Gauss elimination method, 5 

Generated, 75 

Gram-Schmidt Orthogonalization 
Procedure, 139 

Homogeneous system of linear 

equations (see System of linear 
equations) 

Homomorphism (K-), 91 
Identity: 

linear transformation, 94 
matrix, 22 

Image of a linear transformation, 106 
Inequality: 

Cauchy-Schwarz, 134 
triangle, 134 
Inner product, 129, 131 
space, 131 
standard, 132 
Inverse: 

of a matrix, 23, 30—32, 58—61 
of an elementary row operation, 27 
Invertible: 

linear transformation, 111 
matrix, 22, 30-32, 100,121 
Isomorphism (K-), 110 

Jordan canonical form, 156 

K-basis, 76-88 

standard for V_n(K), 79 
K-homomorphism, 91 
K-isomorphic, 114 
K-isomorphism, 111 
K-linear combination, 75 
K-space, 67 

Kernel of a linear transformation, 106 


Length of a vector, 127,132 
Linear: 

combination, 75 
dependent, 76 

equations (see System of linear 
equations) 

independent, 76, 100 
K-algebra, 117 
Linear transformation: 

characteristic polynomial of, 153 

definition of, 91 

eigenvalue of, 149 

eigenvector of, 149 

examples of, 92-94 

image of, 106 

invertible, 111 

kernel of, 106 

matrix of, 96-99 

minimal polynomial of, 162 

non-singular, 111 

nullity of, 106 

rank of, 106 

M_mn(K), M_n(K), 69 

Matrices: 

addition of, 16 
column equivalent, 33 
elementary, 26-33 
elementary operations on, 3-8, 
33-35 

equality of, 16 
equivalent, 36 
multiplication of, 18 
row equivalent, 4, 121—122 
scalar multiplication of, 17 
similarity of, 102, 147, 152, 162 
Matrix: 

adjoint, 59 
augmented, 9 

characteristic polynomial of, 150 

coefficient, 9 

column rank of, 118,146 

columns of a, 3 

definition of a, 3 

determinant of a, 43 

diagonal, 22 

echelon, 4 

eigenvalue of, 149 

eigenvector of, 149 

element of, 3 

elementary, 26 

identity, 22 

inverse of, 23, 30-32, 58—61 
invertible, 22, 30—32, 100, 121 
minimal polynomial of, 161 
non-singular, 23, 54 
of a linear transformation, 96-99 
orthogonal, 25 

rank of a, 118—125, 144—146 

reduced echelon, 4 

row rank of a, 146 

rows of a, 3 

singular, 23 

skew-symmetric, 25 

symmetric, 25 

transpose of a, 23 

zero, 16 

Minimal polynomial: 

of a linear transformation, 162 
of a matrix, 161 
Minor, 44 

Non-singular matrix, 23, 54 

Norm of a vector, 132 

Normal form, 36 

Normalized vector, 132 

Nullity of a linear transformation, 106 

Operations: 

elementary column, 33—35 
elementary row, 3-8 
Orthogonal: 

complement, 145 
matrix, 25 
set, 137 

vectors, 129, 135, 137-142 
Orthogonally similar, 167 
Orthonormal set, 137 

P(K), P_n(K), 68 
Parallelogram rule, 64 
Perpendicular vectors, 129, 135 
Polynomial: 

characteristic, 150, 153 
minimal, 161-162 
Principal-axes theorem, 174 
Product, inner (dot), 129 

Quadratic forms, 172-182 
diagonal, 173 
matrix of, 173 
Quadric surface, 180—182 

Rank: 

column, 118 

of a linear transformation, 106 
of a matrix, 118-125, 144—146 
row, 121 

Reduced echelon matrix, 4 


Reflection, 92 
matrix of a, 98 
Rotation, 92 
matrix of a, 99 
Row: 

equivalence, 4 

equivalent matrices, 4, 121-122 
operators (elementary), 3 
rank of a matrix, 121, 146 
Rows of a matrix, 3 

Scalar multiplication, 17, 66, 68 
Similarity of matrices, 62, 147, 152, 
162 

Singular matrix, 23 
Skew-symmetric matrix, 25 
Solution: 

of a system of linear equations, 9 
space, 72, 73 
trivial, 9 
Space: 

Euclidean, 131 
unitary, 131 

vector (see Vector space) 
Standard: 

inner product, 132 
K-basis, 79 
Subspaces, 71-75 
direct sum of, 144 
finitely generated, 75 
generated by a set, 75 
intersection of, 73, 87—88 
proper, 71 
sum of, 74, 87-88 
union of, 73 
Sum: 

of linear transformations, 115 
of matrices, 16 
of subspaces, 74 
Symmetric matrix, 25 
real, 167-171 

System of linear equations, 1-2, 
9-16, 122-125 
augmented matrix of, 9 
consistent, 9 

Cramer-solution of, 60-61 
equivalent, 9 
homogeneous, 9, 30, 55 
matrix of coefficients of, 9 
solution of, 9 
solution space of, 72, 73 
trivial solution of, 9 

Transformation: 
identity, 94 
linear, 91 
zero, 94 

Transpose of a matrix, 23, 55 

Triangle inequality, 134 

Unit vector, 127, 132 

Unitary space, 131 

Vandermonde determinant, 50 

Vector space: 

definition of, 67 

dimension of, 83 

elementary properties of, 69-70 

examples of, 68—69 

finite dimensional, 83 

finitely generated, 75 

K-basis of a, 77 


Vectors: 

addition of, 64, 66 
angle between, 135 
direction cosines of, 128 
distance between, 128, 134 
length of, 127, 132 
norm of, 132 

orthogonal (perpendicular), 129, 
135,137-142 
unit, 127, 132 
zero, 67 
V_n(K), 66 

Zero: 

element, 67 
matrix, 16 
vector, 67 
A.O. Morris 

This book is intended as a first and elementary introduction to linear 
algebra and matrix theory for first and second year undergraduate 
students. It is based on a first course given at The University College of 
Wales, Aberystwyth over a number of years. The approach emphasises 
the computational and practical aspects of the subject, but at the same 
time aims to give a thorough and rigorous introduction to the subject at a 
level suitable for use in the introduction of abstract concepts in 
mathematics. The book is structured to give the more concrete and practical 
aspects first, leading naturally to the more abstract ideas developed later. 
The main emphasis of the first two chapters is to provide efficient and 
effective techniques for solving linear equations, expanding determinants, 
manipulating matrices, etc. The later chapters are more abstract in 
character, but student motivation is maintained by considering the relevant 
geometrical concept in 2 or 3 dimensions. The aim is to show that the 
definitions of these abstract concepts are the natural and useful 
extensions of what occurs in lower dimension. In this second edition more 
examples of an analytical nature have been included. The final section on 
applications to analytical geometry has been extended and some sections 
have been altered to improve their clarity. Numerous exercises are 
included in each section, with answers at the end of the book. 

Professor Morris obtained his B.Sc. and Ph.D. degrees from the University 
College of North Wales, Bangor. He then became Assistant Lecturer at the 
University College of Wales, Aberystwyth, where he is now Professor and 
Head of Department of Pure Mathematics. He was Visiting Assistant 
Professor at the University of Illinois in 1964-1965. He is a member of the 
London Mathematical Society and of the American Mathematical Society, 
and served for four years on the Council of the London Mathematical 
Society. His research interests are mainly in group representations and 
combinatorics. 